
jwe/CompactSerialize: improve performance. #67

Merged
merged 1 commit into go-jose:v3 on Nov 27, 2023
Conversation

lasiar commented Nov 8, 2023

Use concatenation instead of fmt.Sprintf(...). This reduces memory allocations.

On my machine (M1 Pro, 32 GB RAM):

name                 old time/op    new time/op    delta
CompactSerialize-10    1.32µs ± 1%    1.12µs ± 1%  -15.40%  (p=0.000 n=19+19)

name                 old alloc/op   new alloc/op   delta
CompactSerialize-10    2.21kB ± 0%    2.13kB ± 0%   -3.62%  (p=0.000 n=20+20)

name                 old allocs/op  new allocs/op  delta
CompactSerialize-10      25.0 ± 0%      20.0 ± 0%  -20.00%  (p=0.000 n=20+20)
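
The change is essentially the following shape (a sketch with illustrative variable names, assuming the components have already been base64url-encoded; it is not the literal diff):

// Before: fmt.Sprintf boxes every argument into an interface{} value and
// interprets the format string at run time.
serialized := fmt.Sprintf("%s.%s.%s.%s.%s",
	encodedProtected, encodedKey, encodedIV, encodedCiphertext, encodedTag)

// After: plain concatenation, which the compiler lowers to a single runtime
// call that allocates only the resulting string.
serialized = encodedProtected + "." + encodedKey + "." + encodedIV + "." +
	encodedCiphertext + "." + encodedTag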

jsha commented Nov 15, 2023

This is a neat finding! Do you have an impression of why concatenation is faster than Sprintf in this case?

If string allocations are a key factor in the hot path here, I would expect that it would be faster still to do this:

  • calculate the encoded length of each component (base64 provides EncodedLen), plus the separator dot characters.
  • allocate a []byte of that length.
  • write the encoded values directly into that []byte using Encode.
  • also write the dots.

Would you be interested in benchmarking that variant as well?
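
Roughly, for two of the components, I'm imagining something like this (a sketch with illustrative names, using encoding/base64; not the exact code):

// Encode two components into one buffer sized up front, with the separator
// dot written in between.
enc := base64.RawURLEncoding
n1 := enc.EncodedLen(len(protected))
n2 := enc.EncodedLen(len(payload))

buf := make([]byte, n1+1+n2) // exact final length, including the dot
enc.Encode(buf[:n1], protected)
buf[n1] = '.'
enc.Encode(buf[n1+1:], payload)

serialized := string(buf)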

Also it looks like we follow the same pattern for JSONWebSignature.CompactSerialize. Whichever approach we wind up with here, would you apply it there too?

lasiar commented Nov 16, 2023

Hi @jsha, thank you for the feedback!

Regarding the string formatter: fmt.Sprintf receives its arguments as interface{} values and parses the format string at run time, so it does more work and allocates more than plain string concatenation.
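
A rough illustration of the difference (not from the PR; the names and values are made up):

package jose_test

import (
	"fmt"
	"testing"
)

var (
	header  = "eyJhbGciOiJIUzI1NiJ9"
	payload = "eyJzdWIiOiIxMjM0In0"
	sink    string // package-level sink so the compiler keeps the result
)

// Sprintf boxes both arguments into interface{} values and interprets the
// format string on every call.
func BenchmarkSprintfJoin(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = fmt.Sprintf("%s.%s", header, payload)
	}
}

// Concatenation compiles to a single runtime call that allocates only the
// resulting string.
func BenchmarkConcatJoin(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = header + "." + payload
	}
}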

Regarding the preallocated slice: yes, it is faster on the same machine (M1 Pro, 32 GB RAM):

name                 time/op
CompactSerialize-10   848ns ±11%

name                 alloc/op
CompactSerialize-10  1.63kB ± 0%

name                 allocs/op
CompactSerialize-10    11.0 ± 0% 

But the code is less clear. For example:

encoding.go:

// base64URLEncodeLen returns the length of the unpadded base64url encoding
// of sl.
func base64URLEncodeLen(sl []byte) int {
	return base64.RawURLEncoding.EncodedLen(len(sl))
}

// base64URLEncodeAppend appends the base64url encoding of src to dst and
// returns the extended slice.
func base64URLEncodeAppend(dst, src []byte) []byte {
	old := len(dst)
	need := base64URLEncodeLen(src)
	end := old + need
	// Grow dst until its capacity can hold the encoded bytes; the zero bytes
	// appended here are either overwritten by Encode below or trimmed off by
	// the final dst[:end].
	for cap(dst)-len(dst) < need {
		dst = append(dst[:cap(dst)], 0)
	}

	base64.RawURLEncoding.Encode(dst[old:end], src)
	return dst[:end]
}

jwe.go:

// CompactSerialize serializes an object using the compact serialization format.
func (obj JSONWebEncryption) CompactSerialize() (string, error) {
	if len(obj.recipients) != 1 || obj.unprotected != nil ||
		obj.protected == nil || obj.recipients[0].header != nil {
		return "", ErrNotSupported
	}

	serializedProtected := mustSerializeJSON(obj.protected)

	const countOfDots = 4
	lenSerialized := base64URLEncodeLen(serializedProtected) +
		base64URLEncodeLen(obj.recipients[0].encryptedKey) +
		base64URLEncodeLen(obj.iv) +
		base64URLEncodeLen(obj.ciphertext) +
		base64URLEncodeLen(obj.tag) +
		countOfDots

	out := make([]byte, 0, lenSerialized)

	out = base64URLEncodeAppend(out, serializedProtected)
	out = append(out, '.')

	out = base64URLEncodeAppend(out, obj.recipients[0].encryptedKey)
	out = append(out, '.')

	out = base64URLEncodeAppend(out, obj.iv)
	out = append(out, '.')

	out = base64URLEncodeAppend(out, obj.ciphertext)
	out = append(out, '.')

	out = base64URLEncodeAppend(out, obj.tag)

	return string(out), nil
}

jws.go:

// compactSerialize serializes the object using the compact serialization
// format; when detached is true, the payload segment is left empty.
func (obj JSONWebSignature) compactSerialize(detached bool) (string, error) {
	if len(obj.Signatures) != 1 || obj.Signatures[0].header != nil || obj.Signatures[0].protected == nil {
		return "", ErrNotSupported
	}

	serializedProtected := mustSerializeJSON(obj.Signatures[0].protected)
	const countOfDots = 3

	maxLenSerialized := base64URLEncodeLen(serializedProtected) +
		base64URLEncodeLen(obj.payload) +
		base64URLEncodeLen(obj.Signatures[0].Signature) +
		countOfDots

	serialized := make([]byte, 0, maxLenSerialized)

	serialized = base64URLEncodeAppend(serialized, serializedProtected)
	serialized = append(serialized, '.')

	if !detached {
		serialized = base64URLEncodeAppend(serialized, obj.payload)
	}
	serialized = append(serialized, '.')

	serialized = base64URLEncodeAppend(serialized, obj.Signatures[0].Signature)

	return string(serialized), nil
}

If you're happy with these changes, I can push them.

jsha commented Nov 17, 2023

Thanks for exploring the alternate approach. That's a nice significant speedup.

It seems like we can make it a little clearer by factoring this logic out into joinWithDots(inputs ...[]byte), which would work for any number of inputs and could be used by both call sites.

Also, I like the safety of your approach where the slice is made with cap N and len 0, and appended to. But this feels a little awkward:

	for cap(dst)-len(dst) < need {
		dst = append(dst[:cap(dst)], 0)
	}

I think we would be better off initializing the slice with cap N and len N, and keeping track of the actually-initialized part ourselves. That might also save a little on that loop, depending on how the optimizer handles it.
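
Something along these lines is what I have in mind (a sketch only, using encoding/base64; the final signature and details are up to you, and the merged code may differ):

// joinWithDots base64url-encodes each input and joins the results with '.'
// into a buffer allocated once at its exact final length (cap N, len N),
// tracking the write position explicitly.
func joinWithDots(inputs ...[]byte) string {
	if len(inputs) == 0 {
		return ""
	}

	// One dot between each adjacent pair of inputs, plus the encoded lengths.
	totalLen := len(inputs) - 1
	for _, in := range inputs {
		totalLen += base64.RawURLEncoding.EncodedLen(len(in))
	}

	out := make([]byte, totalLen)
	pos := 0
	for i, in := range inputs {
		if i > 0 {
			out[pos] = '.'
			pos++
		}
		base64.RawURLEncoding.Encode(out[pos:], in)
		pos += base64.RawURLEncoding.EncodedLen(len(in))
	}

	return string(out)
}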

Use concatenate instead of fmt.Sprintf(...). Reduce memory allocation.

lasiar commented Nov 25, 2023

@jsha, done

jsha left a comment

Looks great. Thanks!

jsha merged commit aa386df into go-jose:v3 on Nov 27, 2023
3 checks passed