Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Do nil headers from stream.Header() indicate an error? #6524

Closed
iamnoah opened this issue Aug 9, 2023 · 12 comments · Fixed by #6557 or #6586
Closed

docs: Do nil headers from stream.Header() indicate an error? #6524

iamnoah opened this issue Aug 9, 2023 · 12 comments · Fixed by #6557 or #6586

Comments

@iamnoah
Copy link

iamnoah commented Aug 9, 2023

I have an rpc Test(TestRequest) returns (stream TestMessage) with an implementation that just returns an error immediately.

Then the client does this:

	stream, err := testClient.Test(ctx, &TestRequest{})
	if err != nil {...}    // don't expect an error here
	header, err := stream.Header()
        // but this waited for headers... the error headers were received... where did they go?
	if err != nil {...}  

What is flummoxing me is that most of the time, but not always, stream.Header() returns nil, nil. Occasionally, running in a debugger, it returns the error. A calls to stream.Recv() returns the error reliably, but IDK if there is an error or not, and could end up blocking indefinitely waiting for it, which isn’t good because I have an upstream waiting to know the stream was setup successfully.

What I noticed is that if there was an error, I get nil headers instead of content-type: .... Can I reliably use this as a signal to invoke Recv() and get the underlying error?

Why can't Header() return the error? Inspection in the debugger shows that it reads the error code and message headers out of the frame, but short-circuits because the stream has terminated and internal assigns noHeaders. I just want access to the state that is buried in there (there was an error) without having to resort to trying to recv with a timeout.

@arvindbr8
Copy link
Member

Hi,

It is not safe to assume a nil, nil returned from s.Header() indicates an error. The error that is being sent by the Test RPC that is implemented can only be read by calling stream.Recv(). You might have to call stream.Recv() multiple times if there were other messages sent on the stream before the error message.

This particular scenario seems like you want to implement a proxy. In which case, if testClient.Test doesnt return an err, you can safely assume that you can call Send and Recv on the stream and proxy the data upstream.

A calls to stream.Recv() returns the error reliably, but IDK if there is an error or not, and could end up blocking indefinitely waiting for it,

the blocking read is totally depended on the application's implementation of BiDi streaming. The client could be blocking Recv() if the server has never sent anything on the stream. But in the case of the server explicitly returning an error and closing the stream, you can totally rely on Recv() to return that error and relay that to upstream.

Let us know if this answers your question.

@iamnoah
Copy link
Author

iamnoah commented Aug 10, 2023

server explicitly returning an error and closing the stream, you can totally rely on Recv() to return that error and relay that to upstream.

My situation is that the server will process the request message and either:

  1. Return an error (due to a problem with the request.)
  2. Request is OK. I will then call serverStream.SendHeader(...) and then maybe much later start sending messages and possibly errors.

My problem is that there is no way to definitely detect this error returned before any messages without potentially blocking to wait for a message that may never arrive. This seems defective to me, as the protocol and underlying clientStream knows there is an error, it has already parsed the trailers response containing the error, but instead of returning it, it hides it until I make a call that could possibly block.

IMO clientStream.Header() should return the error it received while waiting for headers. That seems like the least surprising interpretation of calling something that waits for headers in response to not receiving headers. Reveal the state of the stream to me when it is known.

But I can see how a strict interpretation is that it is a method for returning headers and not trailers, so returning nil to indicate there were no headers makes a kind sense. If the documentation stated that nil, nil indicates the stream was terminated without sending headers (e.g., a trailers only response, not necessarily an error) that would be useful. Then I would know Recv() would return an error (either the status or EOF.)

@arvindbr8
Copy link
Member

Hi, I'm not sure if I understand the problem here.

My problem is that there is no way to definitely detect this error returned before any messages without potentially blocking to wait for a message that may never arrive.

Server returns an error for an invalid request sent on the stream by the client. Does that mean the client sends the first message on the stream? Wouldnt the client also call recv() on the stream after sending a request? - which should return the error from the server?

In gRPC, calling stream, err := testClient.Test(ctx, &TestRequest{}) doesnt necessarily create an underlying transport. Most of which is totally depended on the application semantics. In BiDi, this could either be the server or the client sending the first message on the stream.

I'm not convinced there is a case where the client relies on stream.Header() to detect error from server rather than relying on the error returned from recv().

@iamnoah
Copy link
Author

iamnoah commented Aug 14, 2023

In gRPC, calling stream, err := testClient.Test(ctx, &TestRequest{}) doesnt necessarily create an underlying transport

Yes, but there is a unique underlying (HTTP2) stream that is in a definitive state. In my case, after waiting for headers, the stream has terminated with a trailers-only response including an error. You are correct the HTTP2 connection may still be alive, but the stream on that connection is in a known terminal state. I am talking the HTTP2 stream whose state corresponds to the state of the gRPC stream, not the transport connection.

I'm not convinced there is a case where the client relies on stream.Header() to detect error from server rather than relying on the error returned from recv().

The problem is timeouts. recv() only returns an error quickly if there is an error, else it can block until the server decides to send something.

Here is my case: the semantics of my unary->stream call are that the unary message is either accepted for establishing a stream or rejected. The gRPC client is an HTTP server making the call as part of handling an upstream HTTP request. If the gRPC request is accepted, the gRPC server indicates this by sending the headers. So the server is explicitly indicating the request was OK.

At this point, the gRPC client ought to be able report a status to the upstream HTTP client indicating the request was accepted. Except it can't, because we don't know if there was an error, we have to recv() to check that.

If there was no error: the gRPC client calls client.Recv() and might wait minutes or even hours for the server to send something. But I have an upstream HTTP client that needs a response status! I cannot wait minutes or hours to send status headers upstream! The HTTP request will time out, even though it may have been 100% successful.


You might suggest that I have a separate goroutine invoke client.Recv() and wait for that error within a small timeout. That is the best workaround I have found, but this is subject to race conditions. e.g., there is no guarantee the goroutine invoking client.Recv() executes before the timeout, and the timeout imposes a cost on every request.


When client.Header() returns, the gRPC/HTTP2 stream is in a definitive state. Either headers have been received and we are waiting for the first message, or the stream was terminated.

Experimentation leads me to believe that nil headers indicate the stream was terminated without sending headers. My problem is that this behavior is not documented. If I rely on nil headers as an indictor the stream terminated, you (the library authors) may decide later on that a server that sends no custom headers should return nil instead of just a content type or empty map, and I'm back to a successful request blocking indefinitely while my upstream times out.

Documentation that states that some current behavior (nil or even empty headers) indicates a stream was terminated would be sufficient to solve my problem. Then I can invoke Recv() for the error, knowing it will not block indefinitely, and send a correct response upstream. I'd prefer something more explicit, but I don't see a way to do that without changing the interface or returning an error where there was not one before.

@arvindbr8
Copy link
Member

If the gRPC request is accepted, the gRPC server indicates this by sending the headers. So the server is explicitly indicating the request was OK.

the problem is that it is not safe to assume a nil, nil returned from stream.Header() indicates an error from the server. Header() is never supposed to return an error. We cannot change the documentation because this is an implementation details and cannot be relied on.

In your particular case, here are 2 suggested modification you could try implementing

  1. Set a header explicitly on the server and send that back when the stream is created. You can rely on the header presence to deduce an error.
  2. Let the server respond immediately on the stream with a message. This way Recv() is not blocking indefinitely on the client. Recv() is a blocking call, but is totally depended on the server implementation on how long would it be blocking.

@iamnoah
Copy link
Author

iamnoah commented Aug 15, 2023

We cannot change the documentation because this is an implementation details and cannot be relied on.

OK. Then I think this is an implementation bug that when I wait for headers, I cannot learn whether headers were actually received or not. The server is sending a message: it is sending either headers or closing the stream. Just give me access to that information! That's all am I asking for. If the implementation does not provide a way to do that, can someone please explain how that is not a defect in the implementation?

@dfawley
Copy link
Member

dfawley commented Aug 15, 2023

The server is sending a message: it is sending either headers or closing the stream.

The server only sends headers when your method handler instructs it to (grpc.SendHeader for unary or ServerStream.SendHeader for streaming), or before it sends its first response message. If you aren't sending an explicit message, then you need to be doing one of those other two things to make sure the headers are sent.

I have an rpc Test(TestRequest) returns (stream TestMessage) with an implementation that just returns an error immediately.
...
A calls to stream.Recv() returns the error reliably
...
could end up blocking indefinitely waiting for it, which isn’t good because I have an upstream waiting to know the stream was setup successfully.

Why would it block indefinitely? If an error is always written to the stream, Recv will always return it immediately.

Can I reliably use this as a signal to invoke Recv() and get the underlying error?

If you really want to, but I wouldn't recommend it. Yes, content-type is always sent in the response headers. So if the client doesn't receive it, but Header() unblocks, then there must have been some kind of an error. But what are you trying to actually do here?

Why can't Header() return the error?

We only return RPC errors when Recv is called. We do this because, when an error occurs, in many cases there could be messages on the stream that can be read before the error was encountered. This gives you the opportunity to read them first.

@iamnoah
Copy link
Author

iamnoah commented Aug 16, 2023

@dfawley I started with an example that didn't make sense to me at the time (always returning an error produces no error from Header())

I now understand the reasoning for errors being returned by Recv(), which makes senes.

My problem is that I have a server (as described here) that will either: 1. return an error immediately (without sending messages or headers,) or 2. send headers and then at some (perhaps much) later time send a message. So headers are indication that the stream setup was successful.

The problem is that in the non-failure case (2), recv can block indefinitely, so I can't return status upstream in a timely way. I am very frustrated by this because if you look at the http2 stream after Header() returns, the state is known: it has either already received the error and the stream is terminated, or it received some kind of headers and is waiting for more frames.

As a user, I cannot access that state. I just want to know if I really received headers or not. The later means I can invoke Recv() without having to worry that it will block indefinitely. If it were documented that nil headers indicates the stream terminated without headers, that would solve my problem.

@dfawley
Copy link
Member

dfawley commented Aug 16, 2023

Upon closer inspection, it seems there is a bug here that was introduced with #5763. Before that change, the Header() function used to return the stream's status if an error occurred before receiving the headers or if the RPC was terminated by the server with a trailers-only response. I sent #6557 to fix that behavior.

With the fix, if ClientStream.Header() returns an error, then headers were not received from the server. As with RecvMsg, if the error is io.EOF then the RPC completed successfully. You should use the error returned and not use a nil check on the metadata to determine whether there was a problem.

@iamnoah
Copy link
Author

iamnoah commented Aug 17, 2023

@dfawley amazing! Thank you!

@dfawley
Copy link
Member

dfawley commented Aug 25, 2023

Before that change, the Header() function used to return the stream's status if an error occurred before receiving the headers or if the RPC was terminated by the server with a trailers-only response.

Hmm, actually it seems this was not correct. Even before that change, a trailers-only response would have returned nil for the error, and the user was expected to call RecvMsg to get the status.

Also, some of our users were depending on this behavior of Header() returning nil in this situation, so it might not be possible to fix it this way. And I've found a way to fix the retry problem without impacting the API layer in this way.

If we did this, we'd need to find another way to solve your concerns....

@dfawley
Copy link
Member

dfawley commented Aug 25, 2023

I think the most reasonable way to do this would be to document that nil metadata means there were no headers from the server, and that the stream status can be discovered by calling RecvMsg. That is what #6586 does.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
3 participants