Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ssl] Fix SSL stack to handle large handshake messages whose length exceeds the BIO buffer size. #33638

Merged
merged 9 commits into from
Jul 18, 2023

Conversation

matthewstevenson88
Copy link
Contributor

@matthewstevenson88 matthewstevenson88 commented Jul 7, 2023

There is a bug in the SSL stack that was only partially fixed in #29176: if more than 17kb is written to the BIO buffer, then everything over 17kb will be discarded, and the SSL handshake will fail with a bad record mac error or hang if not enough bytes have arrived yet.

It's relatively uncommon to hit this bug, because the TLS handshake messages need to be much larger than normal for you to have a chance of hitting this bug. However, there was a separate bug in the SSL stack (recently fixed in #33558) that causes the ServerHello produced by a gRPC-C++ TLS server to grow linearly in size with the size of the trust bundle; these 2 bugs combined to cause a large number of TLS handshake failures for gRPC-C++ clients talking to gRPC-C++ servers when the server had a large trust bundle.

This PR fixes the bug by ensuring that all bytes are successfully written to the BIO buffer. An initial quick fix for this bug was planned in #33611, but abandoned because we were worried about temporarily doubling the memory footprint of all SSL channels.

The complexity in this PR is mostly in the test: it is fairly tricky to force gRPC-C++'s SSL stack to generate a sufficiently large ServerHello to trigger this bug.

@matthewstevenson88 matthewstevenson88 self-assigned this Jul 7, 2023
@matthewstevenson88 matthewstevenson88 changed the title Initial fix for BAD_DECRYPT error. [ssl] Fix SSL stack to handle large handshake messages whose length exceeds the BIO buffer size. Jul 7, 2023
@matthewstevenson88 matthewstevenson88 added release notes: yes Indicates if PR needs to be in release notes kind/bug area/security and removed PR title format labels Jul 7, 2023
@matthewstevenson88 matthewstevenson88 marked this pull request as ready for review July 10, 2023 03:40
@matthewstevenson88
Copy link
Contributor Author

@zeromath Could you also take a quick look at this PR?

@zeromath
Copy link
Contributor

LGTM!

(I didn't find a button for this, maybe it's because I don't have the permission to do so.)

@matthewstevenson88 matthewstevenson88 removed their assignment Jul 10, 2023
@matthewstevenson88
Copy link
Contributor Author

@markdroth Friendly ping. :)

src/core/tsi/ssl_transport_security.cc Outdated Show resolved Hide resolved
src/core/tsi/ssl_transport_security.cc Outdated Show resolved Hide resolved
src/core/tsi/ssl_transport_security.cc Outdated Show resolved Hide resolved
src/core/tsi/ssl_transport_security.cc Outdated Show resolved Hide resolved
@matthewstevenson88
Copy link
Contributor Author

Thanks all for the reviews!

@matthewstevenson88 matthewstevenson88 merged commit fae2982 into grpc:master Jul 18, 2023
62 of 65 checks passed
@copybara-service copybara-service bot added the imported Specifies if the PR has been imported to the internal repository label Jul 19, 2023
matthewstevenson88 added a commit that referenced this pull request Jul 24, 2023
…3824)

This PR is expected to fix the flakes of
`//test/core/tsi:ssl_transport_security_test` when built under UBSAN.

Why is this needed? There are several tests in
`ssl_transport_security_test.cc` that involve doing many expensive
operations and PR #33638 recently added one more (namely, repeatedly
signing with an ECDSA key). The slow tests are already altered for MSAN
and TSAN, and now we need to do the same for UBSAN.
mario-vimal pushed a commit to mario-vimal/grpc that referenced this pull request Jul 27, 2023
…xceeds the BIO buffer size. (grpc#33638)

There is a bug in the SSL stack that was only partially fixed in grpc#29176:
if more than 17kb is written to the BIO buffer, then everything over
17kb will be discarded, and the SSL handshake will fail with a bad
record mac error or hang if not enough bytes have arrived yet.

It's relatively uncommon to hit this bug, because the TLS handshake
messages need to be much larger than normal for you to have a chance of
hitting this bug. However, there was a separate bug in the SSL stack
(recently fixed in grpc#33558) that causes the ServerHello produced by a
gRPC-C++ TLS server to grow linearly in size with the size of the trust
bundle; these 2 bugs combined to cause a large number of TLS handshake
failures for gRPC-C++ clients talking to gRPC-C++ servers when the
server had a large trust bundle.

This PR fixes the bug by ensuring that all bytes are successfully
written to the BIO buffer. An initial quick fix for this bug was planned
in grpc#33611, but abandoned because we were worried about temporarily
doubling the memory footprint of all SSL channels.

The complexity in this PR is mostly in the test: it is fairly tricky to
force gRPC-C++'s SSL stack to generate a sufficiently large ServerHello
to trigger this bug.
mario-vimal pushed a commit to mario-vimal/grpc that referenced this pull request Jul 27, 2023
…pc#33824)

This PR is expected to fix the flakes of
`//test/core/tsi:ssl_transport_security_test` when built under UBSAN.

Why is this needed? There are several tests in
`ssl_transport_security_test.cc` that involve doing many expensive
operations and PR grpc#33638 recently added one more (namely, repeatedly
signing with an ECDSA key). The slow tests are already altered for MSAN
and TSAN, and now we need to do the same for UBSAN.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/security bloat/none imported Specifies if the PR has been imported to the internal repository kind/bug lang/core per-call-memory/neutral per-channel-memory/neutral release notes: yes Indicates if PR needs to be in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants