
Update upperBound ratio when guessing the required decompression buffer size #141

Merged: 1 commit into 1.x on Jun 28, 2024

Conversation

@sfluor (Contributor) commented on Jun 26, 2024

We noticed that for one of our services we sometimes have a lot of allocations caused by the `ioutil.ReadAll` call in the `Decompress` method.

After investigation, those allocations come from buffers whose decompressed size is greater than 10x the input size. This is quite wasteful because it means we allocate twice for those buffers.

Once in this branch (https://github.com/DataDog/zstd/blob/869dae002e5efb372a0b09cd7d99390ca2089cc1/zstd.go#L143):

`dst = make([]byte, bound)`

And another time here, once we realise that the buffer we previously allocated wasn't big enough (https://github.com/DataDog/zstd/blob/869dae002e5efb372a0b09cd7d99390ca2089cc1/zstd.go#L154):

`// We failed getting a dst buffer of correct size, use stream API`

We have this limit to avoid malicious payloads, but we noticed that it actually triggers quite often for one of our services whose compression ratios are above 10x.

This technically doesn't fully solve the problem, but it should now happen less often. I can also make this upper bound configurable if this change sounds too scary.
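For context, here is a minimal sketch of the sizing pattern described above. It is illustrative only, not the library's exact code: `frameContentSize` is a hypothetical stand-in for reading the decompressed size from the frame header (what the C API exposes as `ZSTD_getFrameContentSize`).

```go
package zstdsketch

// frameContentSize is a hypothetical stand-in for reading the
// decompressed size out of the zstd frame header; ok is false when
// the header does not carry a size.
func frameContentSize(src []byte) (size int, ok bool) {
	// Placeholder: the real binding asks the C library for this.
	return 0, false
}

// decompressSizeHint sketches the guess this PR tunes: trust the
// header size when it is available and under the cap, otherwise use
// the cap itself. When the guess is still too small, Decompress
// falls back to the streaming API and allocates a second time,
// which is the double allocation described above.
func decompressSizeHint(src []byte) int {
	// 1 MB or 50x the input size (the ratio this PR raises from 10x).
	upperBound := 50 * len(src)
	if upperBound < 1<<20 {
		upperBound = 1 << 20
	}
	if hint, ok := frameContentSize(src); ok && hint <= upperBound {
		return hint
	}
	return upperBound
}
```

Under the old 10x ratio, any payload compressed better than 10x either had its header hint clamped to the cap or, with no header size, got a too-small guess; in both cases it paid for the stream-API reallocation.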

@sfluor marked this pull request as ready for review on June 26, 2024, 13:12
Before:
`// 1 MB or 10x input size`
`upperBound := 10 * len(src)`

After:
`// 1 MB or 50x input size`
`upperBound := 50 * len(src)`


That's a huge bump, no? It means we'll allocate 5x more memory than before in all these cases.

@sfluor (Contributor, Author) replied on Jun 27, 2024:

We will in the case where we cannot read the size from the zstd headers (or where the size we read is bigger than 10x the compressed size).

However, IMO it's better to do that and allocate once rather than first allocating 10x the size, realising it's not enough, and reallocating `decompressedSize`.
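To make that trade-off concrete with hypothetical numbers: for a 1 MB payload that really decompresses to 30 MB, the old 10x cap clamps the guess to 10 MB, the one-shot decompression fails, and the stream fallback allocates again, roughly 10 MB + 30 MB across two attempts. With the 50x cap, the 30 MB header hint fits under the bound and a single 30 MB buffer is allocated up front; only payloads with no header size pay the full 50x guess.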

@sfluor merged commit beb4dfd into 1.x on Jun 28, 2024
8 checks passed