
Fix SIGSEGV with zstd compression enabled #1164

Merged: 5 commits merged into apache:master on Feb 1, 2024

Conversation

@RobertIndie (Member) commented Jan 31, 2024

Fixes #1163

Thanks @0x4500 for reporting this issue and providing the analysis.

Motivation

#1121 introduced a regression: the compression logic was moved from internalSend to internalSendAsync, which means compression can now run concurrently across goroutines.

However, zstd_cgo does not support concurrent compression of data. To prevent concurrent access, we need to introduce a mutex.

Modifications

  • Add a mutex for zstd_cgo to serialize compression calls (see the sketch below)
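
For illustration, a minimal sketch of the approach, assuming the provider wraps github.com/DataDog/zstd (the type and field names below are illustrative, not the repository's exact code):

```go
package main

import (
	"sync"

	"github.com/DataDog/zstd"
)

// zstdCGoProvider is an illustrative stand-in for the client's zstd_cgo
// compression provider. The mutex serializes calls into the cgo bindings,
// which are assumed unsafe for concurrent use from internalSendAsync.
type zstdCGoProvider struct {
	mu    sync.Mutex
	level int
}

func (p *zstdCGoProvider) Compress(dst, src []byte) ([]byte, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	// zstd.CompressLevel is DataDog/zstd's one-shot API; with the lock
	// held, only one goroutine compresses at a time.
	return zstd.CompressLevel(dst, src, p.level)
}
```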

Verifying this change

I have run the benchmark; the results show that this change has a minimal impact on performance.

Before this PR (benchmarks for the other compression providers are included for reference):

BenchmarkCompression
BenchmarkCompression/zlib
BenchmarkCompression/zlib-12         	     388	   2904783 ns/op	  34.63 MB/s
BenchmarkCompression/lz4
BenchmarkCompression/lz4-12          	    4832	    239997 ns/op	 419.20 MB/s
BenchmarkCompression/zstd-pure-go-fastest
BenchmarkCompression/zstd-pure-go-fastest-12         	    2618	    428303 ns/op	 234.89 MB/s
BenchmarkCompression/zstd-pure-go-default
BenchmarkCompression/zstd-pure-go-default-12         	    1794	    633533 ns/op	 158.80 MB/s
BenchmarkCompression/zstd-pure-go-best
BenchmarkCompression/zstd-pure-go-best-12            	    1299	    905872 ns/op	 111.06 MB/s
BenchmarkCompression/zstd-cgo-level-fastest
BenchmarkCompression/zstd-cgo-level-fastest-12       	    7480	    176366 ns/op	 570.44 MB/s
BenchmarkCompression/zstd-cgo-level-default
BenchmarkCompression/zstd-cgo-level-default-12       	    1867	    587183 ns/op	 171.34 MB/s
BenchmarkCompression/zstd-cgo-level-best
BenchmarkCompression/zstd-cgo-level-best-12          	     620	   1855183 ns/op	  54.23 MB/s
PASS

After this PR:

BenchmarkCompression
BenchmarkCompression/zlib
BenchmarkCompression/zlib-12         	     351	   2974837 ns/op	  33.82 MB/s
BenchmarkCompression/lz4
BenchmarkCompression/lz4-12          	    4575	    223056 ns/op	 451.03 MB/s
BenchmarkCompression/zstd-pure-go-fastest
BenchmarkCompression/zstd-pure-go-fastest-12         	    2474	    412231 ns/op	 244.05 MB/s
BenchmarkCompression/zstd-pure-go-default
BenchmarkCompression/zstd-pure-go-default-12         	    1666	    641311 ns/op	 156.88 MB/s
BenchmarkCompression/zstd-pure-go-best
BenchmarkCompression/zstd-pure-go-best-12            	    1228	    882828 ns/op	 113.96 MB/s
BenchmarkCompression/zstd-cgo-level-fastest
BenchmarkCompression/zstd-cgo-level-fastest-12       	    6919	    176977 ns/op	 568.47 MB/s
BenchmarkCompression/zstd-cgo-level-default
BenchmarkCompression/zstd-cgo-level-default-12       	    1855	    586434 ns/op	 171.56 MB/s
BenchmarkCompression/zstd-cgo-level-best
BenchmarkCompression/zstd-cgo-level-best-12          	     667	   1799175 ns/op	  55.92 MB/s
PASS
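
For context, the MB/s column in the output above comes from b.SetBytes in a Go benchmark. A self-contained sketch in the same shape, using only the standard library's zlib (the payload size and sub-benchmark layout are assumptions, not the repository's actual benchmark code):

```go
package compression_test

import (
	"bytes"
	"compress/zlib"
	"testing"
)

// payload stands in for the message body compressed by each sub-benchmark.
var payload = bytes.Repeat([]byte("pulsar benchmark payload "), 4096) // ~100 KiB

func BenchmarkCompression(b *testing.B) {
	b.Run("zlib", func(b *testing.B) {
		// SetBytes is what makes `go test -bench` report the MB/s column.
		b.SetBytes(int64(len(payload)))
		for i := 0; i < b.N; i++ {
			var buf bytes.Buffer
			w := zlib.NewWriter(&buf)
			if _, err := w.Write(payload); err != nil {
				b.Fatal(err)
			}
			if err := w.Close(); err != nil {
				b.Fatal(err)
			}
		}
	})
}
```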

This change added tests.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / GoDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@0x4500 commented Jan 31, 2024

Confirmed with a manual test that this PR fixes the issue for us. Thanks @RobertIndie!

@@ -40,7 +40,7 @@ func TimestampMillis(t time.Time) uint64 {
 // GetAndAdd perform atomic read and update
 func GetAndAdd(n *uint64, diff uint64) uint64 {
 	for {
-		v := *n
+		v := atomic.LoadUint64(n)
 		if atomic.CompareAndSwapUint64(n, v, v+diff) {
I'm not familiar enough with the pulsar codebase to understand what this is trying to achieve, but the usual way to perform an atomic get-and-add on a uint64 in golang is to use atomic.AddUint64(). Also, is this change related to the issue at hand, or is it an unrelated change?

@RobertIndie (Member, Author) replied:

This change fixes a race issue in the CI discovered by the new test case TestSendConcurrently.
The difference between them is that atomic.AddUint64() returns the new value, while GetAndAdd here returns the old value.
I actually tried atomic.AddUint64() first, but it failed in the CI because other call sites need the old value.
And yes, this fixes another regression bug introduced in 0.12.0.
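
To make the distinction concrete, a small standalone example (the GetAndAdd body is the one from the diff above; the main function is illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// GetAndAdd performs an atomic read-and-update and returns the *old* value.
// Using atomic.LoadUint64 (instead of a plain `v := *n`) is what fixes the
// data race reported by the race detector.
func GetAndAdd(n *uint64, diff uint64) uint64 {
	for {
		v := atomic.LoadUint64(n)
		if atomic.CompareAndSwapUint64(n, v, v+diff) {
			return v
		}
	}
}

func main() {
	var n uint64 = 10
	fmt.Println(GetAndAdd(&n, 5))        // prints 10: the old value (n is now 15)
	fmt.Println(atomic.AddUint64(&n, 5)) // prints 20: the new value
	fmt.Println(n)                       // prints 20
}
```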

@0x4500 commented Feb 1, 2024

The updated version works for us -- thanks again @RobertIndie.

@merlimat merged commit 8776135 into apache:master on Feb 1, 2024. 6 checks passed.
RobertIndie added a commit referencing this pull request on Feb 2, 2024:
* Fix SIGSEGV with zstd compression enabled

* Use sync.Pool to cache zstd ctx

* Fix race in sequenceID assignment

* Fix GetAndAdd

(cherry picked from commit 8776135)
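
Regarding the second commit in that list, a hedged sketch of the sync.Pool idea, assuming DataDog/zstd's context API (zstd.NewCtx / Ctx.CompressLevel); the pool wiring and function names are illustrative:

```go
package main

import (
	"sync"

	"github.com/DataDog/zstd"
)

// Each Compress call borrows a dedicated zstd context from the pool, so
// concurrent sends never share cgo state and no global mutex is needed.
var zstdCtxPool = sync.Pool{
	New: func() interface{} {
		return zstd.NewCtx()
	},
}

func compress(dst, src []byte, level int) ([]byte, error) {
	ctx := zstdCtxPool.Get().(zstd.Ctx)
	defer zstdCtxPool.Put(ctx)
	return ctx.CompressLevel(dst, src, level)
}
```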
Merging this pull request closed #1163: SIGSEGV in 0.12.0 with zstd compression enabled, when producer is shared between multiple goroutines.