
handle ClientDisconnect errors raised when reading post body, don't log geo warnings on ip not in db #107

Merged

dmueller merged 2 commits into main from dmueller/handle-client-disconnects on Feb 8, 2024

Conversation

@dmueller (Contributor) commented Feb 8, 2024

Goal

  1. I believe the ClientDisconnect exceptions raised when we read the post body after the client has disconnected are showing up as 500s on our graphs. We should handle disconnects gracefully, since the client is no longer waiting for a response. starlette requests.py
  2. Looking at our logs in Google Cloud is difficult because of the volume of noise. 40%+ of the log entries come from IPs that aren't in our geo db.

cloud logs

Implementation Decisions

I looked through the FastAPI docs for guidance on handling ClientDisconnect exceptions and couldn't find any concrete direction; mostly they note that you can detect a disconnect with await request.is_disconnected() (starlette docs). Since the client isn't waiting for a response, I wasn't sure which response makes sense for the handler to return. A rough sketch of one possible handler is below.
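
For illustration only (this is not necessarily the code in this PR), a FastAPI exception handler for ClientDisconnect could look roughly like the sketch below; the 499 status code is borrowed from nginx's "client closed request" convention and is an arbitrary choice here.

```python
# Illustrative sketch only; the actual handler added in app/main.py may differ.
from fastapi import FastAPI, Request, Response
from starlette.requests import ClientDisconnect

app = FastAPI()

@app.exception_handler(ClientDisconnect)
async def handle_client_disconnect(request: Request, exc: ClientDisconnect) -> Response:
    # The client has already gone away, so the response body doesn't matter;
    # the important part is not letting this surface as an unhandled 500.
    return Response(status_code=499)  # 499: nginx-style "client closed request"
```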

The IPs not existing in the db can be filtered in the Google logs UI, but the warning isn't really actionable either, so it felt like extra noise. A sketch of the kind of change this implies is below.
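
For illustration only, and assuming the lookup uses the geoip2 reader (the actual code in app/client.py may use a different library and structure), the change amounts to swallowing the "address not found" case instead of logging a warning for it:

```python
# Hypothetical sketch, not the code in this PR: stop warning on the very
# common "IP not in the geo database" case, since it isn't actionable.
import logging

import geoip2.database
from geoip2.errors import AddressNotFoundError

logger = logging.getLogger(__name__)

def lookup_city(reader: geoip2.database.Reader, ip: str):
    try:
        return reader.city(ip)
    except AddressNotFoundError:
        # Previously this would have logged a warning; now we just treat the
        # lookup as having no result.
        return None
```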

Testing

pipenv run pytest tests/api
pipenv run pytest tests/unit

All Submissions:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

@dmueller dmueller marked this pull request as ready for review February 8, 2024 17:57
@dmueller dmueller requested a review from a team as a code owner February 8, 2024 17:57
app/client.py (review comments: outdated, resolved)
app/main.py (review comments: outdated, resolved)
@bbirdsong2 left a comment

My main question, though, is why are we receiving these ClientDisconnects? From the docs it sounds like they happen during long polling or streaming responses, but that doesn't seem to be our case, so I am not sure why we are experiencing disconnects.

@dmueller (Contributor, Author) commented Feb 8, 2024

My main question, though, is why are we receiving these ClientDisconnects? From the docs it sounds like they happen during long polling or streaming responses, but that doesn't seem to be our case, so I am not sure why we are experiencing disconnects.

These ones are occurring while reading the post body, so I think they would be cancelled requests. I can't say anything more specific than what the call stack shows, though (for example, I'm not sure which scenarios in Firefox would cause them).

@dmueller dmueller merged commit 3df5a45 into main Feb 8, 2024
4 checks passed
@dmueller dmueller deleted the dmueller/handle-client-disconnects branch February 8, 2024 20:26
@bbirdsong2 commented Feb 9, 2024

Hmm, my assumption right now is that some requests are reading the request body, and then a subsequent call in the proxy server hangs on its await of request.json(), since you can only read the request body once on a request. Though, I cannot find what would be making that first call to read the body. It could potentially be the sentry python sdk, here: https://github.com/getsentry/sentry-python/blob/1.39.1/sentry_sdk/integrations/starlette.py#L578.

I am surprised we are not seeing the same error in AWS, though I did see that some requests there appear to have been taking 60 seconds.

Either way, it shouldn't be an issue when we move to mars if it is a python request-body issue.

@dmueller (Contributor, Author) commented Feb 9, 2024

The starlette requests package would throw a different error if it tried to read the body twice (https://github.com/encode/starlette/blob/master/starlette/requests.py#L221-L247) -- it also sets attributes after parsing the body and json so that a subsequent call returns the previously parsed result, roughly the pattern sketched below.
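
Paraphrasing the linked requests.py (this is not the exact upstream source), the caching behaves roughly like this:

```python
# Rough paraphrase of starlette's Request.body()/json() caching; see the
# linked requests.py for the real implementation. In the real class,
# self.stream() yields the incoming request chunks from the ASGI receive channel.
import json

class Request:
    async def body(self) -> bytes:
        if not hasattr(self, "_body"):
            chunks = []
            async for chunk in self.stream():
                chunks.append(chunk)
            self._body = b"".join(chunks)  # cached: a second call reuses this
        return self._body

    async def json(self):
        if not hasattr(self, "_json"):
            self._json = json.loads(await self.body())  # cached likewise
        return self._json
```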

If you saw requests that were taking a minute in AWS, I wonder whether there is an upper limit on request duration in GCP that differs from our AWS settings.

@bbirdsong2

The PR that implemented that caching in starlette, though, says there are still scenarios where the request can hang: encode/starlette#1692 (comment)

From the PR:
The use cases this doesn't fix (reading the request body for the second time in an error handler or a middleware) are also not use cases you can get around using pure ASGI middleware.

I have to assume those are the scenarios we are hitting that cause this error, especially since I believe sentry was added in GCP but not AWS, and I did not see these errors in AWS.
