Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distinguish between truncated and excess content in response #3273

Merged
merged 8 commits into from Feb 12, 2024

Conversation

crazyscientist
Copy link
Contributor

HTTPResponse._raw_read raises IncompleteRead if the content length does not match the expected content length. For malformed responses (e.g. 204 response with content) the re-raised ProtocolError was a bit too unclear about the unexpected excess content being the reason for the exception.

With this change, the exception points out, that the client is not dealing with a connection error but a protocol violation.

Closes #3261

@crazyscientist crazyscientist force-pushed the 3261-misleading-incomplete-read branch 3 times, most recently from 64de020 to 27ed366 Compare January 10, 2024 08:26
illia-v
illia-v previously approved these changes Jan 22, 2024
Copy link
Member

@illia-v illia-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Member

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments to not hold up review from others:

and e.partial is not None
and int(e.expected) == -int(e.partial)
):
arg = f"Protocol violation: Response may not contain content: {e!r}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We don't need to mention "Protocol violation" instead ProtocolError should be visible in tracebacks. The phrasing might mention a "body" as well, content can be anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to come up with something that followed a similar structure as the other messages.

What do you think about this version?

Suggested change
arg = f"Protocol violation: Response may not contain content: {e!r}"
arg = f"Response may not contain a body: {e!r}"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about this version?

Looks good to me.

Maybe we should drop : {e!r} too.
In the traceback from #3261 (comment) the error was urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(16 bytes read, -16 more expected)', IncompleteRead(16 bytes read, -16 more expected)). I don't see why we need that IncompleteRead(16 bytes read, -16 more expected)) duplicated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant dropping : {e!r} just for the new Response may not contain a body

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thank you for the clarification. Pushing a new commit now. Please see 1d6c76d

if (
e.expected is not None
and e.partial is not None
and int(e.expected) == -int(e.partial)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe e.partial is meant to be bytes, not an integer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have noticed the int type annotation in the signature of IncompleteRead.__init__. But the nox linter assumes that it's bytes; without the explicit type cast I get the following complaint:

src/urllib3/response.py:755: error: Non-overlapping equality check (left operand type: "int", right operand type: "bytes")  [comparison-overlap]

Please advise me on your preferred strategy to address this issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd define types of IncompleteRead attributes this way:

class IncompleteRead(HTTPError, httplib_IncompleteRead):
    partial: int  # type: ignore[assignment]
    expected: int

This leads to removing a few type: ignore, and then works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes perfect sense to me 🙂

I had considered that, but it lead to a few test failures, too. So, I went for the explicit type conversion.

But, as you like that idea, let's see where it takes us 👍

Copy link
Contributor

@ecerulm ecerulm Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@illia-v , @crazyscientist

reveal_type(http.client.IncompleteRead.partial) gives note: Revealed type is "builtins.bytes",
reveal_type(http.client.IncompleteRead.expected) gives note: Revealed type is "Union[builtins.int, None]"

@crazyscientist to redefine urllib3.exceptions.IncompleteRead.partial you can follow @illia-v advice, I tried locally and it works:

class IncompleteRead(HTTPError, httplib_IncompleteRead):
    partial: int  # type ignore[override]
    expected: int

nox -rs lint reports no errors , and nox -rs test-3.12 only fail on test_unexpected_body which is added by this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to redefine urllib3.exceptions.IncompleteRead.partial

@ecerulm how does your suggestion differ from my change I committed in 12ce2a7?

nox -rs text-3.12 only fail on test_unexpected_body

Where is the session "text-3.12" defined? After rebasing I keep getting this error:

$ nox -rs text-3.12
nox > Error while collecting sessions.
nox > Sessions not found: text-3.12
$ python3 --version
Python 3.12.1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's test-3.12 not text-3.12

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ecerulm how does your suggestion differ from my change I committed in 12ce2a7?

The type hints are outside of __init__. Full source code

class IncompleteRead(HTTPError, httplib_IncompleteRead):
    """
    Response length doesn't match expected Content-Length

    Subclass of :class:`http.client.IncompleteRead` to allow int value
    for ``partial`` to avoid creating large objects on streamed reads.
    """

    partial: int  # type: ignore[assignment]
    expected: int

    def __init__(self, partial: int, expected: int) -> None:
        self.partial = partial
        self.expected = expected

    def __repr__(self) -> str:
        return "IncompleteRead(%i bytes read, %i more expected)" % (
            self.partial,
            self.expected,
        )

But , if you asking what it's the difference functionally then I believe they are equivalent although I think this style is preferred. I'll let @illia-v and @sethmlarson decide on that (since I'm fairly new here), but at least is the one that is used in urllib3.connectionpool.ConnectionPool, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ecerulm Yes, being new to this code base, it did not occur to me to think of applying the type annotation this way, because I too believe, that both versions yield the same result.

It's your project with your design guidelines. Let's follow that one. Please see c3cdaed

@ecerulm
Copy link
Contributor

ecerulm commented Feb 5, 2024

@crazyscientist ,can you update this branch to the latest main?

Copy link
Contributor

@ecerulm ecerulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the "Protocol violation:", update the branch to latest main, and change urllib3.exceptions.IncompleteRead use the type hints in the way that @illia-v suggested.

src/urllib3/response.py Outdated Show resolved Hide resolved
if (
e.expected is not None
and e.partial is not None
and int(e.expected) == -int(e.partial)
Copy link
Contributor

@ecerulm ecerulm Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@illia-v , @crazyscientist

reveal_type(http.client.IncompleteRead.partial) gives note: Revealed type is "builtins.bytes",
reveal_type(http.client.IncompleteRead.expected) gives note: Revealed type is "Union[builtins.int, None]"

@crazyscientist to redefine urllib3.exceptions.IncompleteRead.partial you can follow @illia-v advice, I tried locally and it works:

class IncompleteRead(HTTPError, httplib_IncompleteRead):
    partial: int  # type ignore[override]
    expected: int

nox -rs lint reports no errors , and nox -rs test-3.12 only fail on test_unexpected_body which is added by this PR.

crazyscientist and others added 4 commits February 5, 2024 14:08
`HTTPResponse._raw_read` raises `IncompleteRead` if the content length does
not match the expected content length. For malformed responses (e.g. 204
response with content) the re-raised `ProtocolError` was a bit too unclear
about the unexpected excess content being the reason for the exception.

With this change, the exception points out, that the client is not dealing
with a connection error but a protocol violation.

Closes urllib3#3261
Copy link
Member

@illia-v illia-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@illia-v illia-v merged commit fa54179 into urllib3:main Feb 12, 2024
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Misleading IncompleteRead exception, if a HTTP response contains unexpected content
4 participants