Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1897 add type hints connection #2162

Merged
merged 17 commits into from
Apr 12, 2021

Conversation

franekmagiera
Copy link
Member

Related to #1897
Added type hints to connection.py and util/connection.py.
I also added some type hints for ssl_.py, wait.py and proxy.py, but only for the functions that are used in connection.py. I can complete the rest later or as a part of another PR, I didn't want to do it now because it already feels like there are a lot of changes here and wanted to get some feedback first.

There are still some issues, will comment the code shortly.

Also, I used Mapping[str, str] for headers. I saw recent discussions about using HTTPHeaderDict for requests. I think it would be ok to leave headers as Mapping[str, str] for connections and let some upper layer class (like ConnectionPool or PoolManager) worry about transforming HTTPHeaderDict to Mapping[str, str] - what do you think about that?

@codecov
Copy link

codecov bot commented Feb 15, 2021

Codecov Report

Merging #2162 (9181469) into main (44ee4ad) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 9181469 differs from pull request most recent head 8b51e60. Consider uploading reports for the commit 8b51e60 to get more accurate results
Impacted file tree graph

@@            Coverage Diff            @@
##              main     #2162   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           24        24           
  Lines         2240      2219   -21     
=========================================
- Hits          2240      2219   -21     
Impacted Files Coverage Δ
src/urllib3/poolmanager.py 100.00% <ø> (ø)
src/urllib3/util/proxy.py 100.00% <ø> (ø)
src/urllib3/connection.py 100.00% <100.00%> (ø)
src/urllib3/util/connection.py 100.00% <100.00%> (ø)
src/urllib3/util/ssl_.py 100.00% <100.00%> (ø)
src/urllib3/util/wait.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44ee4ad...8b51e60. Read the comment docs.

@@ -160,22 +211,28 @@ def _new_conn(self):

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 210 - NewConnectionError expects a ConnectionPool - should it then be raised by ConnectionPool or this exception should be modified?


from urllib3.exceptions import LocationParseError

from .wait import wait_for_read

SOCKET_GLOBAL_DEFAULT_TIMEOUT = socket._GLOBAL_DEFAULT_TIMEOUT # type: ignore
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy returns an error - it cannot find socket._GLOBAL_DEFAULT_TIMEOUT, hence type: ignore. Should urllib3's own GLOBAL_DEFAULT_TIMEOUT be defined?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, using a type ignore for this case is fine.


from urllib3.exceptions import LocationParseError

from .wait import wait_for_read

SOCKET_GLOBAL_DEFAULT_TIMEOUT = socket._GLOBAL_DEFAULT_TIMEOUT # type: ignore
SocketOptions = List[Tuple[int, int, int]]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be restricted to tuples of ints or the last value could also be str/bytes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type for socket.socket.setsockopt() appears to be Revealed type is 'def (self: socket.socket, level: builtins.int, optname: builtins.int, value: Union[builtins.int, builtins.bytes]) so do int/bytes for last type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to List[Tuple[int, int, Union[int, bytes]]]

@@ -109,6 +114,9 @@ def assert_fingerprint(cert, fingerprint):
Fingerprint as string of hexdigits, can be interspersed by colons.
"""

if cert is None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably will have to add a test for that.
cert can be None because getpeercert can return None.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getpeercert returns None only for server SSL sockets, so I added a test that directly tests the assert_fingerprint function in test_ssl.py

Copy link
Member

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, lots of improvements! Here's a bundle of comments for you :)


from .util.proxy import create_proxy_ssl_context
from .util.util import to_str

try: # Compiled with SSL?
if TYPE_CHECKING:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't BaseSSLError required for being imported elsewhere? If so we can't have TYPE_CHECKING bounding it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it is required by connectionpool and response. I assumed that type checking won't be performed in environments where there is no SSL, in that case should I just ignore everything in the except branch? Also, there are some issues regarding conditional imports in mypy python/mypy#1153

@@ -55,6 +74,14 @@ class BaseSSLError(BaseException):
_CONTAINS_CONTROL_CHAR_RE = re.compile(r"[^-!#$%&'*+.^_`|~0-9a-zA-Z]")


_DefaultType = Union[bytes, IO[Any], Iterable[bytes], str] # Type for HTTP body.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have a more descriptive global here for HTTP body type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to HTTPBody

src/urllib3/connection.py Show resolved Hide resolved
src/urllib3/connection.py Show resolved Hide resolved
url: str,
body: Optional[_DefaultType] = None,
headers: Optional[Mapping[str, str]] = None,
*,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encode_chunked parameter wasn't available before, I don't think we should add it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was added to http.client in 3.6. If it is not added then request method violates LSP because of incompatible signatures. I can't think of a workaround that wouldn't require # type: ignore or a refactor. Having two ways to do the same thing is also not ideal. What about refactoring the request method to do all the work including sending chunked requests and then making request_chunked a convenience method for compatibility that would call request?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a case where we want to violate LSP as our API in the future is different from http.client.HTTPConnection's API and our subclassing for now is incidental (see BaseHTTPConnection). I'm comfortable having a type: ignore along with a comment about us intentionally violating LSP

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Will add comments and type ignores.

By the way, do I understand it correctly that in the future when urllib3.HTTPConnection will inherit from BaseHTTPConnection it would have to contain http.client.HTTPConnection (namely, the relationship will be composition instead of inheritance)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@franekmagiera Things will remain how they are for 2.x and then in 3.x we'll be using a different HTTP implementation h2 or h11 instead of only http.client.HTTPConnection. BaseHTTPConnection can be used as a parent class for the new HTTP connections.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment and type ignore for the request method.

src/urllib3/connection.py Show resolved Hide resolved

from urllib3.exceptions import LocationParseError

from .wait import wait_for_read

SOCKET_GLOBAL_DEFAULT_TIMEOUT = socket._GLOBAL_DEFAULT_TIMEOUT # type: ignore
SocketOptions = List[Tuple[int, int, int]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type for socket.socket.setsockopt() appears to be Revealed type is 'def (self: socket.socket, level: builtins.int, optname: builtins.int, value: Union[builtins.int, builtins.bytes]) so do int/bytes for last type.

cert_reqs: Optional[int] = None,
ca_certs: Optional[str] = None,
ca_cert_dir: Optional[str] = None,
ca_cert_data: Optional[str] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can ca_cert_data be of type bytes? I'm not sure

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, typeshed uses cadata: Union[Text, bytes, None] = ...

@@ -33,32 +35,35 @@ def _is_ge_openssl_v1_1_1(
)


try: # Do we have ssl at all?
if TYPE_CHECKING:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if statement can be independent from the try-except below I believe? Basically we still want the from ssl import ... to happen even when type checking

# This test needs to be here in order to be run. socket.create_connection actually tries
# to connect to the host provided so we need a dummyserver to be running.
with HTTPConnectionPool(self.host, self.port, socket_options=None) as pool:
with HTTPConnectionPool(self.host, self.port, socket_options=[]) as pool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it seems we're supporting None as well can we instead parametize this test for both None and []?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameterized the test for None and []

if headers is None:
headers = {}
else:
# Avoid modifying the headers passed into .request()
headers = headers.copy()
headers = copy(headers)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid using copy.copy()? Why does mypy balk at headers = headers.copy()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because headers are of type Mapping[str, str] and Mapping does not have a copy method. Changing headers type would again violate LSP, so the same situation as with the code_enchunked discussion. So, probably makes sense to add comments and type ignores again and make headers to be dict for now? This also raises the question if in the BaseHTTPConnection.request should take headers as dict or HTTPHeaderDict (or union of both, which seems best I think). Related to #2158

@franekmagiera
Copy link
Member Author

@sethmlarson Thanks for doing the review! I made some changes, will address the rest tomorrow.

@@ -23,6 +23,8 @@
cast,
)

from typing_extensions import Literal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't have typing_extensions be top-level, instead use if typing.TYPE_CHECKING: from typing import Literal and then string literals "Literal[False]"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, forgot its just a dev dependency... Changed that.

@KapJI
Copy link

KapJI commented Apr 3, 2021

@sethmlarson any updates on this one? Looking forward to see fully typed urllib3 🙂

Copy link
Member

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for all of this work!!! Sorry it took so long for me to review, these days are busy :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants