-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handle ClientDisconnect errors raicsed when reading post body, don't log geo warnings on ip not in db #107
Conversation
…log geo warnings on ip not in db
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main question though, is why are we receiving these ClientDIsconnects? From the doc it sounds like it happens during long polling or streaming responses, but that doesn't seem to be our case, so I am not sure why we are experiencing disconnects.
These ones are occurring while reading the post body, so I think it would be cancelled requests. I don't know more specific than basing that on the call stack though (like not sure what scenarios in Firefox would cause them). |
… framework for disconnected response
Hmm, my assumption right now is some requests are reading the request body, and then a subsequent call in the proxy server hangs on it's await to request.json(), since you can only read the request body once on a request. Though, I cannot find what would be making that first call to read the body. It could be the sentry python sdk potentially here: https://github.com/getsentry/sentry-python/blob/1.39.1/sentry_sdk/integrations/starlette.py#L578. I am surprised we are not seeing the same error in AWS, but I did see that some requests appear to have been taking 60 seconds in AWS though. Either way it shouldn't be an issue when we move to mars if it is a python request body issue. |
The starlette requests package would throw a different error if it tried to read the body twice https://github.com/encode/starlette/blob/master/starlette/requests.py#L221-L247 -- it also sets attributes after parsing the body and json so that a subsequent call would return the previously parsed result. I wonder if there is an upper limit on request duration that differs from our settings in gcp compared to aws if you saw requests that were a minute long. |
The PR that implemented that caching in starlette though says there are still scenarios when the request can hang: encode/starlette#1692 (comment) From the PR: I have to assume those are the scenarios we are hitting to cause this error. Especially since I believe sentry was added to GCP but not AWS, and I did not see these errors in AWS. |
Goal
cloud logs
Implementation Decisions
I tried reading FastAPI docs on what is best to do with ClientDisconnect exceptions to handle them and couldn't find any concrete direction. Mainly they would comment that you could detect the disconnect with
await request.is_disconnected()
starlette docs Since the client isn't waiting for a response, I wasn't sure which response makes sense for the handler to return.The ips not existing in the db can be filtered in google logs UI, but it also seemed like something that isn't really actionable so it felt like extra noise.
Testing
pipenv run pytest tests/api
pipenv run pytest tests/unit
All Submissions: