Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve overloads and return types for iterparse.__new__ #19

Closed
Daverball opened this issue Jul 20, 2023 · 3 comments · Fixed by #21
Closed

Improve overloads and return types for iterparse.__new__ #19

Daverball opened this issue Jul 20, 2023 · 3 comments · Fixed by #21
Assignees
Labels
enhancement New feature or request

Comments

@Daverball
Copy link

Daverball commented Jul 20, 2023

If we only pass in events that all have the same layout i.e. _NoNSEventNames, then we know that the resulting iterator always yields the event name as the first tuple element and an _Element as the second one. Consider the following example which gives a broad type union that is annoying to deal with

iterator = etree.iterparse(file_path, events=('start', 'end'))
reveal_type(iterator)  # Revealed type is "iterparse[tuple[_SaxEventNames, _Element | tuple[str, str] | None]]"

Adding a new overload for _NoNSEventNames which returns iterparser[tuple[_NoNSEventNames, _Element]] should fix this, but this only solves part of the issue.

It should also be possible to improve things further by using a tagged union, rather than a tuple of unrelated unions, since we know the object that belongs to every event, i.e. rather than doing tuple[_EventUnion, _ValueUnion] you would do:

Union[
    tuple[_NoNSEventNames, _Element],
    tuple[Literal['start-ns'], tuple[str, str]],
    tuple[Literal['end-ns'], None]
]

With the tagged union type checkers can perform type narrowing on the value type when doing a simple string comparison on the event, so this would allow for the following:

iterator = etree.iterparse(file_path, events=('start', 'end', 'start-ns', 'end-ns'))
for event, value in iterator:
    if event == 'start-ns':
        reveal_type(value)  # Revealed type is "tuple[str, str]"
    elif event == 'end-ns':
        reveal_type(value)  # Revealed type is "None"
    else:
        reveal_type(value)  # Revealed type is "_Element"
@abelcheung abelcheung self-assigned this Jul 22, 2023
@abelcheung
Copy link
Owner

I still have some fresh memory having to make an abridged @overloads due to daunting possibility of overload entries. But as you said, the resulting broad union type is indeed hard to deal with.

And yes, tagged union can be useful here, I'm sort of dumbfound it didn't come up while writing annotations. Still taking a few days off computer, when I'm back I'll look into it and make a better compromise between the extremes.

@abelcheung abelcheung added the enhancement New feature or request label Oct 18, 2023
@abelcheung
Copy link
Owner

@Daverball Finally fully remembered what happened when I tried implementing the stubs. It doesn't work as you had hoped.

for event, value in iterator:

At this line, the types of event and value are already determined; event is union of all events and value is union of all value types. So type narrowing won't happen in the following if/else block. mypy 0.9.x era and pyright already behaved like this, and that is still true now, even if the iterator generates union of tuple.

I would incorporate the change you suggested, but even with that, you need to write code in not-so-pythonic way:

for item in iterator:
    if item[0] == 'start-ns':
        reveal_type(item[1])
    elif item[0] == 'end-ns':
        reveal_type(item[1])
> mypy test.py
test.py:186: note: Revealed type is "tuple[builtins.str, builtins.str]"
test.py:188: note: Revealed type is "None"

@Daverball
Copy link
Author

@abelcheung You are correct, it is still an overall improvement though. We can profit from the additional overload, even if we unpack the tuple, before any type narrowing can occur.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging a pull request may close this issue.

2 participants