Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type hints for _collections.py #2157

Conversation

franekmagiera
Copy link
Member

@franekmagiera franekmagiera commented Feb 10, 2021

Related to #1897

Took a stab at adding type hints for _collections.py.

A couple of things to mention:

  • RLock is ignored because of issues with conditional imports (mypy bug with try/except conditional imports python/mypy#1153)
  • RecentlyUsedContainer.keys is ignored because in order to be consistent with base class MutableMapping and not break Liskov substitution principle (which mypy complains about) it would have to return AbstractSet which does not preserve ordering of the keys. (Same for HTTPHeaderDict.keys)
  • Deleted the return statement in HTTPHeaderDict.__setitem__ to not break LSP (similar situation as above).
  • For HTTPHeaderDict.__contains__ it would make sense for the key to be of type str, but this would again violate LSP, so I set the key type to object and added a check inside of the method (if you will be fine with this solution I will add a test for __contains__).
  • Methods pop and getlist use __marker as default, which means the default type has to be a Union of object and str or List[str]. Not sure if it would make sense to use "" and [] as defaults, for now I am just casting the values.
  • Finally, I am not exactly sure what the best type for headers would be. For now I decided to use Union["HTTPHeaderDict", Mapping[str, str], Iterator[Tuple[str, str]]]. Judging by the tests for this class, it would also make sense to include something like NonMappingHeaderContainer that does not implement everything that Mapping does, but just has the keys and __getitem__ attributes. I tried defining an abstract base class to act as an interface and add it to the union but it did not do much - I still had to ignore lines 293 and 294 to not get an error. Not sure if it is connected to Type inference and hasattr python/mypy#1424 (maybe it would be better to cast as this issue indicates), or if this is something that could be easily fixed.

@codecov
Copy link

codecov bot commented Feb 10, 2021

Codecov Report

Merging #2157 (f21bd39) into main (6711102) will decrease coverage by 0.04%.
The diff coverage is 99.34%.

Impacted file tree graph

@@             Coverage Diff             @@
##              main    #2157      +/-   ##
===========================================
- Coverage   100.00%   99.95%   -0.05%     
===========================================
  Files           25       25              
  Lines         2219     2213       -6     
===========================================
- Hits          2219     2212       -7     
- Misses           0        1       +1     
Impacted Files Coverage Δ
src/urllib3/__init__.py 100.00% <ø> (ø)
src/urllib3/util/__init__.py 100.00% <ø> (ø)
src/urllib3/_collections.py 99.32% <97.77%> (-0.68%) ⬇️
src/urllib3/connection.py 100.00% <100.00%> (ø)
src/urllib3/connectionpool.py 100.00% <100.00%> (ø)
src/urllib3/exceptions.py 100.00% <100.00%> (ø)
src/urllib3/fields.py 100.00% <100.00%> (ø)
src/urllib3/filepost.py 100.00% <100.00%> (ø)
src/urllib3/util/ssl_.py 100.00% <100.00%> (ø)
src/urllib3/util/url.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6711102...3207e11. Read the comment docs.

@pquentin
Copy link
Member

@franekmagiera Thank you for this! There's already an existing pull request for collections: #2115. Can you take a look and see what you've done differently?

@franekmagiera
Copy link
Member Author

Oh, it's a shame I did not notice that PR. Not that many differences, except some things I mentioned are already implemented.

In #2115 RLock is imported from threading if TYPE_CHECKING is true, I think it is nice to use it for type checking and have the current logic for cases where threading is not available.

RecentlyUsedContainer.__setitem__ - I changed the _Null to None, defined evicted_value as Optional[_VT] and used get instead of pop. The difference is that try/catch is not needed.

For RecentlyUsedContainer.keys I ignore the return type so I don't have to return a Set. In #2115 Set is returned to comply with LSP, but tests are changed so that internal _container is used which is not ideal. Maybe it would be better to have something like a view that is in #2115 for HTTPHeaderDict but not sure if it wouldn't be too much overhead.

My HTTPHeaderDict.__contains__ is simpler but not that type safe as in #2115

ValidHTTPSource, HasGettableStringKeys and HTTPHeaderDictItemView from #2115 are nice abstractions that are lacking here.

Not sure what you want to do with this PR. As this is mostly duplicate work I am ok with just closing it.

@pquentin
Copy link
Member

Thanks for telling us the precise differences!

@haikuginger What are your thoughts on this?

@haikuginger
Copy link
Contributor

@pquentin thoughts inline:

RecentlyUsedContainer.__setitem__ - I changed the _Null to None, defined evicted_value as Optional[_VT] and used get instead of pop. The difference is that try/catch is not needed.

The try/catch construct was a suggestion of @SethMichaelLarson, IIRC. I actually like this structure because it encodes the reordering of the data structure on insert more explicitly than the .get() flow does—the flow for this version depends on the ABC's .get() implementation using our custom __getitem__ that does an internal pop/reinsert. In fact, the ABC will do the same try/except KeyError in its .get() implementation that we handle ourselves in this case.

For RecentlyUsedContainer.keys I ignore the return type so I don't have to return a Set. In #2115 Set is returned to comply with LSP, but tests are changed so that internal _container is used which is not ideal. Maybe it would be better to have something like a view that is in #2115 for HTTPHeaderDict but not sure if it wouldn't be too much overhead.

I need to do some tweaking on HTTPHeaderDictItemView because it doesn't currently check the right things in __contains__, but I would generally prefer to conform to the ABC's interface and avoid # type: ignore whenever possible to avoid introducing accidental Anys and ensure that we can pass into anything expecting a MutableMapping. _container is used in testing to avoid the problem of the lack of an order guarantee when comparing against a set-like item without introducing an interface that specifically commits to a particular order.

My HTTPHeaderDict.__contains__ is simpler but not that type safe as in #2115

It looks equally correct to me...

ValidHTTPSource, HasGettableStringKeys and HTTPHeaderDictItemView from #2115 are nice abstractions that are lacking here.

I think the lack of ValidHttpHeaderSource here (in particular, the lack of HasGettableStringKeys results in weaker overall typing here and could result in existing behavior breaking.

Overall, there's definite overlap; I'd like to proceed with my version and get it to a mergeable state.

@sethmlarson
Copy link
Member

I'm in complete agreement with @haikuginger, thanks all for figuring out the best way forwards!

@sethmlarson
Copy link
Member

sethmlarson commented Feb 11, 2021

@franekmagiera Hopefully this doesn't discourage you one bit, your work is very high quality and we all hope you are willing to provide type coverage or other improvements! 🙌 Thanks again for this contribution!!

@franekmagiera
Copy link
Member Author

No worries, things like this happen ;) It's my mistake that I did not notice that @haikuginger already started working on this. Still, it was a good chance to brush up on type hinting.

@pquentin
Copy link
Member

Awesome, thank you all! @haikuginger When your PR is ready, can you ask @franekmagiera to review it? They're now an expert in collections.py type hints too. :D

@franekmagiera Would you like to work on typing another part of urllib3? Not collections, exceptions or response, though!

@franekmagiera
Copy link
Member Author

@pquentin I am playing around with type hints for connection.py. If I get somewhere with it, I'll post a PR.

@haikuginger
Copy link
Contributor

@pquentin I think I can only request a review from within the organization, but I'd be happy to take a review from @franekmagiera now if they'd like to add it!

@franekmagiera franekmagiera deleted the 1897-add-type-hints-collections branch April 13, 2021 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants