transport: stop always closing connections when loopy returns #6110

dfawley · 2023-03-09T23:24:35Z

As discovered in #6019, closing a connection upon encountering an I/O error can result in loss of data that was sent before the connection was closed. We previously believed that any data in the TCP connection would still be available to the reader indefinitely, but it appears that is not the case. This change makes it so we only close the connection from the loopy writer in non-I/O error situations. If an I/O error causes the loopy writer to exit, we don't need to close the connection, as the reader goroutine will also encounter an I/O error once it is done consuming all the data.

RELEASE NOTES:

transport: do not close connections when we encounter I/O errors until after all data is consumed

easwars · 2023-03-11T01:58:10Z

internal/transport/controlbuf.go

 	for {
 		it, err := l.cbuf.get(true)
 		if err != nil {
+			l.closeConnection()
 			return err
 		}
 		if err = l.handle(it); err != nil {


handle() returns an error only when it encounters an unknown control message type. Is this an I/O error? Shouldn't we close the connection here? Same applies to the handle() call down below.

Also, I see that processData() today returns an error only when writing data or headers fails. But how can we guarantee that in the future? Should we at least document that handle() and processData() should return errors only for I/O-related events, and document that loopy.run() closes the connection only when it sees a non-I/O error?

handle returns errors from the handlers themselves, which have a lot of I/O error possibilities. We'll need to do that closing in handle itself. Or we could make the errors carry a type with a bool to indicate whether they are I/O errors, but that felt messier.

It seems there are two ways to do this. Commit 1 wraps errors in a lot of different places, which feels finicky. Commit 2 wraps them in the writer, which might not work if the http2 framer decides to start wrapping errors without supporting Unwrap (but that seems very unlikely, and we could deal with it if it ever happens).

I prefer the ioError option as well, and the PR looks good to me.

easwars · 2023-03-11T02:12:30Z

internal/transport/controlbuf.go

@@ -846,7 +856,8 @@ func (l *loopyWriter) handle(i interface{}) error {
 	case *outFlowControlSizeRequest:
 		l.outFlowControlSizeRequestHandler(i)
 	case closeConnection:
-		return l.closeConnectionHandler()
+		l.closeConnection()


The only current usage of the closeConnection type is from the server-side keepalive code (when the grace period expires). Why do we need this separate type? Why can't we instead simply close the transport and let it call Close() on the underlying connection? Is this intended to be a way to close the connection after completing all pending tasks in the controlbuf, whereas closing the transport will immediately close the underlying connection without completing pending tasks?

Yes, we need to flush any pending writes that may have made it legal to close the transport at this time (vs waiting longer for streams to finish). This was added recently: #5821
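
As a simplified model of that ordering (not the real controlbuf; `dataFrame`, `fakeConn`, and `drain` are hypothetical names, and `closeConnection` here is a local stand-in type), a close request queued behind pending writes takes effect only after those writes have been processed:

```go
package main

import "fmt"

// dataFrame is a hypothetical queued write.
type dataFrame struct{ payload string }

// closeConnection is a queued request to close once earlier items are handled.
type closeConnection struct{}

// fakeConn records what was written and whether the connection was closed.
type fakeConn struct {
	written []string
	closed  bool
}

// drain processes queued items in FIFO order. Writes queued before the close
// request reach the connection; anything queued after it is dropped.
func drain(items []interface{}, conn *fakeConn) {
	for _, it := range items {
		switch it := it.(type) {
		case dataFrame:
			conn.written = append(conn.written, it.payload)
		case closeConnection:
			conn.closed = true
			return
		}
	}
}

func main() {
	conn := &fakeConn{}
	drain([]interface{}{
		dataFrame{"GOAWAY"},
		dataFrame{"trailers"},
		closeConnection{},
		dataFrame{"late"},
	}, conn)
	fmt.Println(conn.written, conn.closed)
}
```

Closing the transport directly would correspond to setting `closed` without draining the queue first, so the earlier writes could be lost.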

transport: stop always closing connections when loopy returns

5239071

dfawley added the Type: Bug label Mar 9, 2023

dfawley added this to the 1.54 Release milestone Mar 9, 2023

dfawley requested a review from easwars March 9, 2023 23:24

dfawley assigned easwars Mar 9, 2023

dfawley mentioned this pull request Mar 10, 2023

Why gRPC Server closes a connection at the 2nd GOAWAY. #6019

Closed

easwars reviewed Mar 10, 2023

easwars reviewed Mar 11, 2023

easwars assigned dfawley and unassigned easwars Mar 11, 2023

dfawley added 2 commits March 13, 2023 09:47

toIOError

02c93a9

toIOError in writer

19a6c0e

dfawley assigned easwars and unassigned dfawley Mar 13, 2023

typo

ddc5d8d

easwars approved these changes Mar 13, 2023

easwars assigned dfawley and unassigned easwars Mar 13, 2023

dfawley merged commit b458a4f into grpc:master Mar 14, 2023
10 checks passed

dfawley deleted the cbufcleanup branch March 14, 2023 20:32

hiyosi mentioned this pull request Mar 17, 2023

I/O error when the server closes a connection #6127

Closed

crazy-max mentioned this pull request Jul 28, 2023

update grpc/protobuf dependencies and protoc moby/buildkit#4074

Closed

github-actions bot locked as resolved and limited conversation to collaborators Sep 11, 2023
