perf: add pooling for contexts #167

Closed
KowalskiThomas wants to merge 3 commits into 1.x from kowalski/perf-reuse-contexts

@KowalskiThomas KowalskiThomas commented Jun 19, 2026

What is this PR?

This PR proposes making compression and decompression of single payloads faster across the board by pooling compression and decompression contexts, instead of creating a new context for every call and discarding it after the operation.

The implementation is simple:

  • We add two new structs, cctxWrapper and dctxWrapper, as well as two sync.Pool instances to manage them.
  • We use ZSTD_compressCCtx (respectively ZSTD_decompressDCtx) instead of ZSTD_compress (respectively ZSTD_decompress), passing a pooled context object to it.
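
The pooling pattern described above can be sketched in pure Go as follows. This is a minimal illustration, not the code from this PR: fakeCCtx is a hypothetical stand-in for a native ZSTD_CCtx, the fields of cctxWrapper are assumed, and compress only copies bytes where the real code would call ZSTD_compressCCtx through cgo.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeCCtx is a hypothetical stand-in for a native ZSTD_CCtx. In the real
// change the wrapper would hold a C pointer obtained through cgo.
type fakeCCtx struct {
	scratch []byte // reusable working memory, the costly part to allocate
}

// cctxWrapper reuses the struct name from the PR; its fields are assumed.
type cctxWrapper struct {
	ctx *fakeCCtx
}

// created counts how many contexts were actually allocated by the pool.
var created atomic.Int64

// cctxPool hands out wrappers and allocates a new one only when it is empty.
var cctxPool = sync.Pool{
	New: func() any {
		created.Add(1)
		return &cctxWrapper{ctx: &fakeCCtx{scratch: make([]byte, 0, 1<<16)}}
	},
}

// compress is a placeholder for a call such as ZSTD_compressCCtx: it borrows
// a context from the pool, uses its memory, and returns it when done.
// No real compression happens here; the input is copied unchanged.
func compress(src []byte) []byte {
	w := cctxPool.Get().(*cctxWrapper)
	defer cctxPool.Put(w)

	// Reuse the context's buffer instead of allocating fresh working memory.
	w.ctx.scratch = append(w.ctx.scratch[:0], src...)

	// Copy the result out, because the context goes back to the pool.
	out := make([]byte, len(w.ctx.scratch))
	copy(out, w.ctx.scratch)
	return out
}

func main() {
	const calls = 1000
	for i := 0; i < calls; i++ {
		compress([]byte("payload"))
	}
	fmt.Printf("calls: %d, contexts allocated: %d\n", calls, created.Load())
}
```

In a sequential loop like this one, the number of contexts allocated is typically far lower than the number of calls, though sync.Pool gives no hard guarantee. A real wrapper around a native context would presumably also need to release the native memory when the pool drops it (for example with a finalizer calling ZSTD_freeCCtx); the sketch leaves that out.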

As far as I can tell, this change doesn't come with a significant trade-off. We effectively delegate memory management for contexts to the two pools, but I expect this to be a non-issue in the vast majority of cases. An application that repeatedly compresses or decompresses payloads would otherwise be constantly allocating and freeing contexts, probably leading to higher memory usage (objects waiting to be GC'd) and more fragmentation. On the other hand, an application that makes a "burst" of calls to the compression/decompression functions could allocate many contexts at once, but sync.Pool (eventually) frees unused objects automatically during GC.

Testing

I ran the benchmarks that I wrote for this optimisation and temporarily added to the branch (see commit history). The results are as follows.

macOS / MacBook Pro M4 Max

| Benchmark | Before | After | Improvement |
| --- | --- | --- | --- |
| Compress 1K | 4425 ns | 4120 ns | ~7% |
| Compress 8K | 18900 ns | 16830 ns | ~11% |
| Compress 64K | 149900 ns | 148900 ns | ~1% |
| Decompress 1K | 1983 ns | 1862 ns | ~6% |
| Decompress 8K | 3908 ns | 3747 ns | ~4% |
| Decompress 64K | 23748 ns | 22919 ns | ~3.5% |
| Compress 8K (parallel) | 1551 ns | 1404 ns | ~9.5% |

Linux / DoE workspace

| Benchmark | Before | After | Improvement |
| --- | --- | --- | --- |
| Compress 1K | 18900 ns | 11420 ns | ~40% |
| Compress 8K | 62020 ns | 52810 ns | ~15% |
| Compress 64K | 463900 ns | 446900 ns | ~3.7% |
| Decompress 1K | 10251 ns | 3193 ns | ~69% |
| Decompress 8K | 14533 ns | 7330 ns | ~50% |
| Decompress 64K | 53790 ns | 46770 ns | ~13% |
| Compress 8K (parallel) | 15690 ns | 13410 ns | ~15% |
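
The Improvement column appears to be the relative reduction in time, (before - after) / before, which matches the rounded figures in both tables. A small sketch for reproducing it, where improvement is a hypothetical helper rather than part of the PR:

```go
package main

import "fmt"

// improvement returns the relative time reduction in percent,
// assuming the column is computed as (before - after) / before.
func improvement(beforeNs, afterNs float64) float64 {
	return (beforeNs - afterNs) / beforeNs * 100
}

func main() {
	// Values taken from the tables above.
	fmt.Printf("Decompress 1K (Linux): %.1f%%\n", improvement(10251, 3193))   // prints 68.9%
	fmt.Printf("Compress 64K (macOS): %.1f%%\n", improvement(149900, 148900)) // prints 0.7%
}
```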

@KowalskiThomas KowalskiThomas force-pushed the kowalski/perf-reuse-contexts branch from bd5bca4 to 1cd20a8 Compare June 19, 2026 08:13
@KowalskiThomas KowalskiThomas force-pushed the kowalski/perf-reuse-contexts branch from 1cd20a8 to 30f7a4e Compare June 19, 2026 08:19
@KowalskiThomas KowalskiThomas changed the title perf: reuse contexts perf: add pooling for contexts Jun 19, 2026
@KowalskiThomas KowalskiThomas marked this pull request as ready for review June 19, 2026 08:25