Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Http2 protocol violation #633

Closed
erebe opened this issue Aug 12, 2022 · 11 comments
Closed

Http2 protocol violation #633

erebe opened this issue Aug 12, 2022 · 11 comments

Comments

@erebe
Copy link
Contributor

erebe commented Aug 12, 2022

Hello,

I am reporting an issue I have encountered while using tonic grpc server (v0.8.0).
We use nginx as a GRPC reverse proxy and hit the following corner case.
Full details here: https://trac.nginx.org/nginx/ticket/2376 you can also find a pcap capture of the issue in the ticket

With this setup
[client] === grpc stream (with ssl) ===> [nginx] === grpc stream (cleartext) ===> [backend]

and considering the following grpc service description

service Agent {
     rpc StreamToServer(stream Empty {}) returns (Empty) {}
}

If the http2 server generated by tonic early respond to the client without consuming fully the input stream

impl Agent {
    async fn stream_to_server (
        &self,
        request: Request<Streaming<Empty>>,
    ) -> Result<Response<Empty>, Status> {
        let _ = request.into_inner();
        // we don't care of the request at all
        // Just sending back our response before end of client stream
        Ok(Response::new(Empty {}))
    }
}

The following packets are going to be emitted by tonic/hyper/h2, in response to the call.

HEADERS[1],  DATA[1] (GRPC empty data),  HEADERS[1] (TRAILING with flag end_stream) , RST_STREAM(error: CANCEL)[1]

This specific sequence of packet is causing nginx to miss-behave and not forward the DATA & RST_STREAM back to the client.
After discussion with a maintainer of nginx, this is caused by the last RST_STREAM(error: CANCEL) which is invalid in regard of the spec, it should be a RST_STREAM(NO_ERROR)

As per the RFC

   ... A server can
   send a complete response prior to the client sending an entire
   request if the response does not depend on any portion of the request
   that has not been sent and received.  When this is true, a server MAY
   request that the client abort transmission of a request without error
   by sending a RST_STREAM with an error code of NO_ERROR after sending
   a complete response (i.e., a frame with the END_STREAM flag).
   Clients MUST NOT discard responses as a result of receiving such a
   RST_STREAM, though clients can always discard responses at their
   discretion for other reasons.

I tracked down where is coming this RST_STREAM(CANCEL), at first I thought it was in tonic but it ended-up being in the impl Drop of h2

h2::proto::streams::streams::maybe_cancel streams.rs:1467
h2::proto::streams::streams::drop_stream_ref::{closure#0} streams.rs:1443
h2::proto::streams::counts::Counts::transition<h2::proto::streams::streams::drop_stream_ref::{closure_env#0}, ()> counts.rs:127
h2::proto::streams::streams::drop_stream_ref streams.rs:1442
h2::proto::streams::streams::{impl#12}::drop streams.rs:1403

.schedule_implicit_reset(stream, Reason::CANCEL, counts, &mut actions.task);

So it seems a generic way of handling the end of stream for h2.
And now I am stuck in my investigation and require some guidance regarding where to go from here.

  • Replacing the Reason::CANCEL by Reason::NO_ERROR solves my issue with nginx (tested locally) and seem more adequate in regard of the RFC, but I don't know if there is un-expected side effect to that for other cases
  • I don't know if it is tonic or hyper that miss use h2 and require somewhere some special handling for this scenario

We really like tonic and nginx and would appreciate if we can go forward to make both happy to work together

@seanmonstar
Copy link
Member

Yes, I believe we've discussed before that there are cases that this should send NO_ERROR instead of CANCEL. Probably only in this specific case:

The full response has been sent (or scheduled to send), and any option DATA and trailers. Basically, that an end-of-stream frame is going out.

I think CANCEL is still correct in the general case, where the user has dropped all handles but never triggered end-of-stream, and also if the client drops handle waiting for the response.

@erebe
Copy link
Contributor Author

erebe commented Aug 12, 2022

Hello, thanks for the answer !

So I am at the right place 🎆
Is changing the maybe_cancel code by something like this would feel right to you ?
Or is there some other cases to handle ?

fn maybe_cancel(stream: &mut store::Ptr, actions: &mut Actions, counts: &mut Counts) {
    if stream.is_canceled_interest() {
+        // To match HalfClosedLocal(Streaming) state
+       let reason = if stream.state.is_send_closed() && stream.state.is_recv_streaming() {
+            Reason::NO_ERROR
+        } else {
+           Reason::CANCEL
+        };

        actions
            .send
            .schedule_implicit_reset(stream, reason, counts, &mut actions.task);
        actions.recv.enqueue_reset_expiration(stream, counts);
    }
}

@seanmonstar
Copy link
Member

Ummmm, hard for me to remember. I imagine if you changed that function without the conditional, some unit test would complain? That might show which unit test needs to be updated, or maybe we dont have this specific case as a test...

@erebe
Copy link
Contributor Author

erebe commented Aug 12, 2022

Ok, so I am impressed by the quality of your tests ...
Would dream to have those mock in my projects.

Regarding the issue, by adding an is_server() check in the condition and adapting 1 test that seems to match my case.
Everything is green with a cargo test --all-features --all

diff --git a/src/proto/streams/streams.rs b/src/proto/streams/streams.rs
index f11ee08..bfce840 100644
--- a/src/proto/streams/streams.rs
+++ b/src/proto/streams/streams.rs
@@ -1461,9 +1461,16 @@ fn drop_stream_ref(inner: &Mutex<Inner>, key: store::Key) {

 fn maybe_cancel(stream: &mut store::Ptr, actions: &mut Actions, counts: &mut Counts) {
     if stream.is_canceled_interest() {
+        // To match HalfClosedLocal(Streaming) state
+        let reason = if counts.peer().is_server() && stream.state.is_send_closed() && stream.state.is_recv_streaming() {
+            Reason::NO_ERROR
+        } else {
+            Reason::CANCEL
+        };
+
         actions
             .send
-            .schedule_implicit_reset(stream, Reason::CANCEL, counts, &mut actions.task);
+            .schedule_implicit_reset(stream, reason, counts, &mut actions.task);
         actions.recv.enqueue_reset_expiration(stream, counts);
     }
 }
diff --git a/tests/h2-tests/tests/server.rs b/tests/h2-tests/tests/server.rs
index b3bf1a2..e61fd3b 100644
--- a/tests/h2-tests/tests/server.rs
+++ b/tests/h2-tests/tests/server.rs
@@ -566,7 +566,7 @@ async fn sends_reset_cancel_when_req_body_is_dropped() {
         client
             .recv_frame(frames::headers(1).response(200).eos())
             .await;
-        client.recv_frame(frames::reset(1).cancel()).await;
+        client.recv_frame(frames::reset(1).reason(Reason::NO_ERROR)).await;
     };

     let srv = async move {

If this seems good to you, I can go forward and make a PR next week.
If you require some other things before, let me know

@seanmonstar
Copy link
Member

Nice, looks good to me!

erebe added a commit to erebe/h2 that referenced this issue Aug 13, 2022
  Http2 Server are allowed to early respond without fully
  consuming client input stream, but must respond with an
  error code of NO_ERROR when sending RST_STREAM.
  Nginx treat any other error code as fatal if not done so

  Commit change error code from CANCEL to NO_ERROR, when the
  server is early responding to the client

  hyperium#633
  https://trac.nginx.org/nginx/ticket/2376
@erebe
Copy link
Contributor Author

erebe commented Aug 13, 2022

Here we are #634 :)
It needs some approval to run the github actions

seanmonstar pushed a commit that referenced this issue Aug 15, 2022
Http2 Server are allowed to early respond without fully
  consuming client input stream, but must respond with an
  error code of NO_ERROR when sending RST_STREAM.
  Nginx treat any other error code as fatal if not done so

  Commit change error code from CANCEL to NO_ERROR, when the
  server is early responding to the client

  #633
  https://trac.nginx.org/nginx/ticket/2376
@erebe
Copy link
Contributor Author

erebe commented Aug 16, 2022

Thank you for the kind words and your reactivity ^^

By any chances, do you plan to make a new release soonish ?

@LucioFranco
Copy link
Member

This feels related to hyperium/tonic#992, right?

@erebe
Copy link
Contributor Author

erebe commented Aug 16, 2022

Hello @LucioFranco
Most likely yes, I haven't tried with a python client, but the issue you describe is the one I had with nginx.
Only RST_STREAM(NO_ERROR) is allowed on early server response.

I was planning to open an issue in tonic to ask to bump h2 version, once a new release is made, but seems you are ahead of me

@erebe
Copy link
Contributor Author

erebe commented Aug 17, 2022

Thank you @seanmonstar and @LucioFranco , I am closing the issue as it is resolved and released 👍

@erebe erebe closed this as completed Aug 17, 2022
@seanmonstar
Copy link
Member

Thanks again for the excellent work! If you're ever interested in helping on other aspects of h2, we'd welcome your work gladly! <3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants