Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use empty range when there's "gap" in token source #11032

Merged
merged 1 commit into from Apr 19, 2024
Merged

Conversation

dhruvmanila
Copy link
Member

Summary

This fixes a bug where the parser would panic when there is a "gap" in
the token source.

What's a gap?

The reason it's <= instead of just == is because there could be whitespaces between
the two tokens. For example:

#     last token end
#     | current token (newline) start
#     v v
def foo \n
#      ^
#      assume there's trailing whitespace here

Or, there could tokens that are considered "trivia" and thus aren't emitted by the token
source. These are comments and non-logical newlines. For example:

#     last token end
#     v
def foo # comment\n
#                ^ current token (newline) start

In either of the above cases, there's a "gap" between the end of the last token and start
of the current token.

Test Plan

Add test cases and update the snapshots.

Copy link
Contributor

github-actions bot commented Apr 19, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+0 -1 violations, +0 -0 fixes in 1 projects; 43 projects unchanged)

python/typeshed (+0 -1 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select E,F,FA,I,PYI,RUF,UP,W

- stdlib/pathlib.pyi:111:89: E999 SyntaxError: Expected ':', found newline

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
E999 1 0 1 0 0

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

Comment on lines +264 to +268
// It's possible during error recovery that the parsing didn't consume any tokens. In that
// case, `last_token_end` still points to the end of the previous token but `start` is the
// start of the current token. Calling `TextRange::new(start, self.last_token_end)` would
// panic in that case because `start > end`. This path "detects" this case and creates an
// empty range instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: i think this is specific to error recovery and instead is true for all "empty" nodes that don't consist of any tokens?

If that's the case, then I think we can remove the missing_node_range method that was only added to handle empty node ranges.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think missing_node_range is still useful in places where you don't have the start value or rather the start value itself is the current token start.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working parser Related to the parser
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants