
Fix SIGSEGV with zstd compression enabled #1164

Merged: 5 commits merged into apache:master on Feb 1, 2024

Conversation

@RobertIndie (Member) commented Jan 31, 2024

Fixes #1163

Thanks @0x4500 for reporting this issue and providing the analysis.

Motivation

#1121 introduced a regression: the compression logic was moved from internalSend to internalSendAsync, which means compression can now run concurrently across goroutines.

However, zstd_cgo does not support concurrent compression of data. To prevent concurrent access, we need to introduce a mutex.

Modifications

  • Add a mutex for zstd_cgo to serialize compression calls (see the sketch below)
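
For illustration, a minimal sketch of the approach, assuming the provider wraps github.com/DataDog/zstd (the type and field names below are illustrative, not the repository's exact code):

```go
package main

import (
	"sync"

	"github.com/DataDog/zstd"
)

// zstdCGoProvider is an illustrative stand-in for the client's zstd_cgo
// compression provider. The mutex serializes calls into the cgo bindings,
// which are assumed unsafe for concurrent use from internalSendAsync.
type zstdCGoProvider struct {
	mu    sync.Mutex
	level int
}

func (p *zstdCGoProvider) Compress(dst, src []byte) ([]byte, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	// zstd.CompressLevel is DataDog/zstd's one-shot API; with the lock
	// held, only one goroutine compresses at a time.
	return zstd.CompressLevel(dst, src, p.level)
}
```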

Verifying this change

I have run the benchmark; the results show that this change has a minimal impact on performance.

Before this PR (benchmarks for the other compression providers are included for reference):

BenchmarkCompression
BenchmarkCompression/zlib
BenchmarkCompression/zlib-12         	     388	   2904783 ns/op	  34.63 MB/s
BenchmarkCompression/lz4
BenchmarkCompression/lz4-12          	    4832	    239997 ns/op	 419.20 MB/s
BenchmarkCompression/zstd-pure-go-fastest
BenchmarkCompression/zstd-pure-go-fastest-12         	    2618	    428303 ns/op	 234.89 MB/s
BenchmarkCompression/zstd-pure-go-default
BenchmarkCompression/zstd-pure-go-default-12         	    1794	    633533 ns/op	 158.80 MB/s
BenchmarkCompression/zstd-pure-go-best
BenchmarkCompression/zstd-pure-go-best-12            	    1299	    905872 ns/op	 111.06 MB/s
BenchmarkCompression/zstd-cgo-level-fastest
BenchmarkCompression/zstd-cgo-level-fastest-12       	    7480	    176366 ns/op	 570.44 MB/s
BenchmarkCompression/zstd-cgo-level-default
BenchmarkCompression/zstd-cgo-level-default-12       	    1867	    587183 ns/op	 171.34 MB/s
BenchmarkCompression/zstd-cgo-level-best
BenchmarkCompression/zstd-cgo-level-best-12          	     620	   1855183 ns/op	  54.23 MB/s
PASS

After this PR:

BenchmarkCompression
BenchmarkCompression/zlib
BenchmarkCompression/zlib-12         	     351	   2974837 ns/op	  33.82 MB/s
BenchmarkCompression/lz4
BenchmarkCompression/lz4-12          	    4575	    223056 ns/op	 451.03 MB/s
BenchmarkCompression/zstd-pure-go-fastest
BenchmarkCompression/zstd-pure-go-fastest-12         	    2474	    412231 ns/op	 244.05 MB/s
BenchmarkCompression/zstd-pure-go-default
BenchmarkCompression/zstd-pure-go-default-12         	    1666	    641311 ns/op	 156.88 MB/s
BenchmarkCompression/zstd-pure-go-best
BenchmarkCompression/zstd-pure-go-best-12            	    1228	    882828 ns/op	 113.96 MB/s
BenchmarkCompression/zstd-cgo-level-fastest
BenchmarkCompression/zstd-cgo-level-fastest-12       	    6919	    176977 ns/op	 568.47 MB/s
BenchmarkCompression/zstd-cgo-level-default
BenchmarkCompression/zstd-cgo-level-default-12       	    1855	    586434 ns/op	 171.56 MB/s
BenchmarkCompression/zstd-cgo-level-best
BenchmarkCompression/zstd-cgo-level-best-12          	     667	   1799175 ns/op	  55.92 MB/s
PASS
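
For context, the MB/s column in the output above comes from b.SetBytes in a Go benchmark. A self-contained sketch in the same shape, using only the standard library's zlib (the payload size and sub-benchmark layout are assumptions, not the repository's actual benchmark code):

```go
package compression_test

import (
	"bytes"
	"compress/zlib"
	"testing"
)

// payload stands in for the message body compressed by each sub-benchmark.
var payload = bytes.Repeat([]byte("pulsar benchmark payload "), 4096) // ~100 KiB

func BenchmarkCompression(b *testing.B) {
	b.Run("zlib", func(b *testing.B) {
		// SetBytes is what makes `go test -bench` report the MB/s column.
		b.SetBytes(int64(len(payload)))
		for i := 0; i < b.N; i++ {
			var buf bytes.Buffer
			w := zlib.NewWriter(&buf)
			if _, err := w.Write(payload); err != nil {
				b.Fatal(err)
			}
			if err := w.Close(); err != nil {
				b.Fatal(err)
			}
		}
	})
}
```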

This change added tests.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / GoDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@0x4500 commented Jan 31, 2024

Confirmed with a manual test that this PR fixes the issue for us. Thanks @RobertIndie!

@@ -40,7 +40,7 @@ func TimestampMillis(t time.Time) uint64 {
 // GetAndAdd perform atomic read and update
 func GetAndAdd(n *uint64, diff uint64) uint64 {
 	for {
-		v := *n
+		v := atomic.LoadUint64(n)
 		if atomic.CompareAndSwapUint64(n, v, v+diff) {
I'm not familiar enough with the pulsar codebase to understand what this is trying to achieve, but the usual way to perform an atomic get-and-add on a uint64 in golang is to use atomic.AddUint64(). Also, is this change related to the issue at hand, or is it an unrelated change?

@RobertIndie (Member, Author) replied:

This change fixes a race issue in the CI discovered by the new test case TestSendConcurrently.
The difference between them is that atomic.AddUint64() returns the new value, while GetAndAdd here returns the old value.
I actually tried atomic.AddUint64() first, but it failed in the CI because other call sites need the old value.
And yes, this fixes another regression bug introduced in 0.12.0.
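
To make the distinction concrete, a small standalone example (the GetAndAdd body is the one from the diff above; the main function is illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// GetAndAdd performs an atomic read-and-update and returns the *old* value.
// Using atomic.LoadUint64 (instead of a plain `v := *n`) is what fixes the
// data race reported by the race detector.
func GetAndAdd(n *uint64, diff uint64) uint64 {
	for {
		v := atomic.LoadUint64(n)
		if atomic.CompareAndSwapUint64(n, v, v+diff) {
			return v
		}
	}
}

func main() {
	var n uint64 = 10
	fmt.Println(GetAndAdd(&n, 5))        // prints 10: the old value (n is now 15)
	fmt.Println(atomic.AddUint64(&n, 5)) // prints 20: the new value
	fmt.Println(n)                       // prints 20
}
```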

@0x4500 commented Feb 1, 2024

The updated version works for us -- thanks again @RobertIndie.

@merlimat merged commit 8776135 into apache:master on Feb 1, 2024. 6 checks passed.
RobertIndie added a commit referencing this pull request on Feb 2, 2024:
* Fix SIGSEGV with zstd compression enabled

* Use sync.Pool to cache zstd ctx

* Fix race in sequenceID assignment

* Fix GetAndAdd

(cherry picked from commit 8776135)
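
Regarding the second commit in that list, a hedged sketch of the sync.Pool idea, assuming DataDog/zstd's context API (zstd.NewCtx / Ctx.CompressLevel); the pool wiring and function names are illustrative:

```go
package main

import (
	"sync"

	"github.com/DataDog/zstd"
)

// Each Compress call borrows a dedicated zstd context from the pool, so
// concurrent sends never share cgo state and no global mutex is needed.
var zstdCtxPool = sync.Pool{
	New: func() interface{} {
		return zstd.NewCtx()
	},
}

func compress(dst, src []byte, level int) ([]byte, error) {
	ctx := zstdCtxPool.Get().(zstd.Ctx)
	defer zstdCtxPool.Put(ctx)
	return ctx.CompressLevel(dst, src, level)
}
```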
Merging this pull request closed #1163: SIGSEGV in 0.12.0 with zstd compression enabled, when producer is shared between multiple goroutines.