Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch writePump performance hit #613

Merged
merged 7 commits into from
Sep 22, 2022

Conversation

denopink
Copy link
Contributor

@denopink denopink commented Jul 28, 2022

@johnabass asked me to look into patching the performance hit after patching the writePump panic caused by the same encoder in #595
Looks like this block of code in writePump was using the nil after reseting the encoder, but it wasn't accomplishing anything with that call. The expected behavior was that a nil would reset the buffer to an empty []byte.

encoder.ResetBytes(&frameContents)
writeError = encoder.Encode(envelope.request.Message)
encoder.ResetBytes(nil) // Upto v1.2.3 it did nothing, v1.2.4 and newer causes a panic

Actually, looking through ugorji's codec releases, it looks like this wouldn't have done anything upto v1.2.3

// v1.2.3 and older `encoder.ResetBytes(nil)` does nothing
// ResetBytes resets the Encoder with a new destination output []byte.
func (e *Encoder) ResetBytes(out *[]byte) {
	if out == nil {
		return
	}
	var in []byte = *out
	if in == nil {
		in = make([]byte, defEncByteBufSize)
	}
	e.bytes = true
	e.wb.reset(in, out)
	e.resetCommon()
}

Releases v1.2.4 and newer would have started the panic scenario

// v1.2.4 causes writepump panic with `encoder.ResetBytes(nil)`
// ResetBytes resets the Encoder with a new destination output []byte.
func (e *Encoder) ResetBytes(out *[]byte) {
	var in []byte = *out
	if in == nil {
		in = make([]byte, defEncByteBufSize)
	}
	e.bytes = true
	e.wb.reset(in, out)
	e.resetCommon()
}
// v1.2.5 and newer causes writepump panic with `encoder.ResetBytes(nil)`
// ResetBytes resets the Encoder with a new destination output []byte.
func (e *Encoder) ResetBytes(out *[]byte) {
	e.bytes = true
	e.wb.reset(encInBytes(out), out)
	e.resetCommon()
}

What change in the ugorji/go/codec ResetBytes is that out buffers can no longer be nil's. Specifically, encoder.ResetBytes(nil) will kick off a panic due to an eventual nil dereferencing caused by ugorji's codec's encInBytes receiving out as a nil.

func encInBytes(out *[]byte) (in []byte) {
	in = *out
	if in == nil {
		in = make([]byte, defEncByteBufSize)
	}
	return
}

The solution is to revert the code and replace encoder.ResetBytes(nil) with encoder.ResetBytes(&emptyBuffer).

How To Test For writePump Panic

Ran the simulator & talaria locally or in one of dev/cd envs

Tested the patch with the following:

curl -v --location --request POST 'localhost:6200/api/v3/device/send' \
--header 'Authorization: Basic ${AUTH}' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--data-raw '{
"msg_type":3,
"content_type":"application/json",
"source":"dns:me",
"dest":"mac:112233445566",
"transaction_uuid":"1234567890",
"payload":"eyJjb21tYW5kIjoiR0VUIiwibmFtZXMiOlsiU29tZXRoaW5nIl19",
"partner_ids":["comcast"]
}'

// Output:

Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying ::1:6200...
* Connected to localhost (::1) port 6200 (#0)
> POST /api/v3/device/send HTTP/1.1
> Host: localhost:6200
> User-Agent: curl/7.77.0
> Authorization: Basic ${AUTH}
> Content-Type: application/json
> Accept: application/json
> Content-Length: 287
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 648
< Content-Type: application/json
< X-Talaria-Build: 0.1.4
< X-Talaria-Flavor: mint
< X-Talaria-Region: east
< X-Talaria-Server: talaria
< X-Talaria-Start-Time: 21 Apr 22 16:23 UTC
< Date: Thu, 21 Apr 2022 16:42:49 GMT
< 
* Connection #0 to host localhost left intact
{"msg_type":3,"source":"mac:112233445566","dest":"dns:me","transaction_uuid":"1234567890","content_type":"application/octet-stream","metadata":{"partner-id":"comcast","hw-serial-number":"mock-rdkb-simulator","hw-manufacturer":"Example","hw-mac":"112233445566","hw-last-reboot-reason":"unknown","fw-name":"mock-rdkb-firmware","boot-time":"1650557221","webpa-last-reconnect-reason":"webpa_process_starts","webpa-protocol":"PARODUS-2.0-1.1.4-6-gad2d43b","hw-model":"aker-testing","webpa-interface-used":"eth0","webpa-uuid":"1234567-345456546"},"payload":"eyJzdGF0dXNDb2RlIjo1MzEsIm1lc3NhZ2UiOiJTZXJ2aWNlIFVuYXZhaWxhYmxlIn0=","partner_ids":["unknown"]}

Looks like this block of code in writePump was using the nil after reseting the encoder, but it wasn't accomplishing anything with that call
```go

encoder.ResetBytes(&frameContents)
writeError = encoder.Encode(envelope.request.Message)
encoder.ResetBytes(nil) // Upto v1.2.3 it did nothing, v1.2.4 and newer causes a panic
```
Actually, looking through ugorji's codec releases, it looks like this wouldn't have done anything upto v1.2.3
```go
// v1.2.3 and older `encoder.ResetBytes(nil)` does nothing
// ResetBytes resets the Encoder with a new destination output []byte.
func (e *Encoder) ResetBytes(out *[]byte) {
	if out == nil {
		return
	}
	var in []byte = *out
	if in == nil {
		in = make([]byte, defEncByteBufSize)
	}
	e.bytes = true
	e.wb.reset(in, out)
	e.resetCommon()
}
```
Releases v1.2.4 and newer would have started the panic scenario
```go
// v1.2.4 causes writepump panic with `encoder.ResetBytes(nil)`
// ResetBytes resets the Encoder with a new destination output []byte.
func (e *Encoder) ResetBytes(out *[]byte) {
	var in []byte = *out
	if in == nil {
		in = make([]byte, defEncByteBufSize)
	}
	e.bytes = true
	e.wb.reset(in, out)
	e.resetCommon()
}
```
```go
// v1.2.5 and newer causes writepump panic with `encoder.ResetBytes(nil)`
// ResetBytes resets the Encoder with a new destination output []byte.
func (e *Encoder) ResetBytes(out *[]byte) {
	e.bytes = true
	e.wb.reset(encInBytes(out), out)
	e.resetCommon()
}
```
What change in the ugorji/go/codec `ResetBytes` is that `out` buffers can no longer be nil's. Specifically, `encoder.ResetBytes(nil)` will kick off a panic due to an eventual nil dereferencing caused by ugorji's [`codec's encInBytes`](https://github.com/ugorji/go/blob/b4c725930670fc2d46721b17f4d6974c66fb50c1/codec/encode.go#L1451-L1452) receiving `out` as a nil.
```go
func encInBytes(out *[]byte) (in []byte) {
	in = *out
	if in == nil {
		in = make([]byte, defEncByteBufSize)
	}
	return
}
```
The solution is to revert the code and remove `encoder.ResetBytes(nil)` since it never did anything.
@denopink denopink added this to In progress in XMiDT via automation Jul 28, 2022
@denopink denopink self-assigned this Jul 28, 2022
@codecov
Copy link

codecov bot commented Jul 28, 2022

Codecov Report

Merging #613 (1c81890) into main (9105cce) will decrease coverage by 0.09%.
The diff coverage is 25.00%.

@@            Coverage Diff             @@
##             main     #613      +/-   ##
==========================================
- Coverage   82.68%   82.59%   -0.10%     
==========================================
  Files         184      184              
  Lines       10338    10340       +2     
==========================================
- Hits         8548     8540       -8     
- Misses       1580     1591      +11     
+ Partials      210      209       -1     
Flag Coverage Δ
unittests 82.59% <25.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
device/manager.go 61.49% <25.00%> (-0.34%) ⬇️
service/consul/instancer.go 65.51% <0.00%> (-5.52%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@denopink denopink moved this from In progress to PRs: Pending approval in XMiDT Jul 28, 2022
@sonarcloud
Copy link

sonarcloud bot commented Jul 29, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

Copy link
Member

@renaz6 renaz6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this done and researching so thoroughly!

XMiDT automation moved this from PRs: Pending approval to PRs: Approved Aug 18, 2022
@sonarcloud
Copy link

sonarcloud bot commented Aug 29, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@sonarcloud
Copy link

sonarcloud bot commented Sep 22, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

25.0% 25.0% Coverage
0.0% 0.0% Duplication

@denopink denopink merged commit 28bb7f5 into main Sep 22, 2022
XMiDT automation moved this from PRs: Approved to Done Sep 22, 2022
@denopink denopink deleted the denopink/feature/improve-writePump-performance branch September 22, 2022 21:01
@denopink denopink mentioned this pull request Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
XMiDT
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

3 participants