Skip to content

Commit

Permalink
Better error recovery for match patterns (#10477)
Browse files Browse the repository at this point in the history
## Summary

This PR updates the match statement's pattern parsing logic to better
facilitate the error recovery. The following changes are made:

### Disallow star pattern (`*<name>`) outside of sequence pattern

This is done by passing an enum stating whether to allow star pattern or
not and is based on where the parser is in the stack.

<details><summary>Call stack for pattern parsing functions:</summary>
<p>

I took notes on where would I requires changing this, so sharing it for
documenting purposes:

* `parse_match_patterns`
  * `parse_match_pattern` (`AllowStarPattern::Yes`)
    * Error if it’s a star pattern and it's not a sequence pattern
  * `parse_sequence_match_pattern`
    * `parse_match_pattern` (`AllowStarPattern::Yes`)
* `parse_match_pattern` (`allow_star_pattern: AllowStarPattern`)
  * `parse_match_pattern_lhs` (`allow_star_pattern: AllowStarPattern`)
    * `parse_match_pattern_mapping`
      * `parse_match_pattern_lhs` (`AllowStarPattern::No`)
      * `parse_match_pattern` (`AllowStarPattern::No`)
    * `parse_match_pattern_star`
    * `parse_delimited_match_pattern`
      * `parse_match_pattern` (`AllowStarPattern::Yes`)
      * `parse_sequence_match_pattern`
    * `parse_match_pattern_literal`
      * `parse_attr_expr_for_match_pattern`
    * `parse_match_pattern_class`
      * `parse_match_pattern` (`AllowStarPattern::No`)
    * `parse_complex_literal_pattern`
      * `parse_match_pattern_lhs` (`AllowStarPattern::No`)

</p>
</details> 

The mapping pattern `FIRST` set has been updated to include `*` to allow
the parser to recover from this error in mapping pattern. Otherwise, the
list parsing would completely skip parsing it.

### Convert `Pattern` to `Expr`

In certain cases, the parser expects a specific type of pattern and uses
that to extract information. For example, in complex literal, we parse
the LHS and RHS as patterns but the LHS can only be a real number while
the RHS can only be a complex number. This means that if the parser
found any other pattern kind then we would convert the pattern into an
equivalent expression and add an appropriate error. This will help any
downstream tools.

Refer to `recovery::pattern_to_expr` for more details.

### Extract complex number literal pattern parsing logic

Earlier, this was inside the `parse_match_pattern_lhs` which made the
method a bit more complex than it needed to be. The logic has been
extracted to a new `parse_complex_pattern_literal` and is called when
the parser encounters a pattern followed by either `+` or `-`.

The parser only allows a valid complex literal pattern which is
`real_number +/- complex_number`. For any other case, an appropriate
error will be raised.

## Review

Refer to the test cases to help in the review process.

## Test Plan

Add test cases and update the snapshots.
  • Loading branch information
dhruvmanila committed Mar 25, 2024
1 parent 45def95 commit 236aa4a
Show file tree
Hide file tree
Showing 27 changed files with 4,756 additions and 813 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
match subject:
# Parser shouldn't confuse this as being a
# class pattern
# v
case (x as y)(a, b):
# ^^^^^^
# as-pattern
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
match subject:
# Parser shouldn't confuse this as being a
# complex literal pattern
# v
case (x as y) + 1j:
# ^^^^^^
# as-pattern
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
match subject:
# This `as` pattern is unparenthesied so the parser never takes the path
# where it might be confused as a complex literal pattern.
case x as y + 1j:
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
match subject:
# Not in the mapping start token set, so the list parsing bails
# v
case {(x as y): 1}:
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
match subject:
# This `as` pattern is unparenthesized so the parser never takes the path
# where it might be confused as a mapping key pattern.
case {x as y: 1}:
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Invalid keyword pattern in class argument
match subject:
case Foo(x as y = 1):
pass
case Foo(x | y = 1):
pass
case Foo([x, y] = 1):
pass
case Foo({False: 0} = 1):
pass
case Foo(1=1):
pass
case Foo(Bar()=1):
pass
# Positional pattern cannot follow keyword pattern
# case Foo(x, y=1, z):
# pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
match invalid_lhs_pattern:
case Foo() + 1j:
pass
case x + 2j:
pass
case _ + 3j:
pass
case (1 | 2) + 4j:
pass
case [1, 2] + 5j:
pass
case {True: 1} + 6j:
pass
case 1j + 2j:
pass
case -1j + 2j:
pass

match invalid_rhs_pattern:
case 1 + Foo():
pass
case 2 + x:
pass
case 3 + _:
pass
case 4 + (1 | 2):
pass
case 5 + [1, 2]:
pass
case 6 + {True: 1}:
pass
case 1 + 2:
pass

match invalid_lhs_rhs_pattern:
case Foo() + Bar():
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# Starred expression is not allowed as a mapping pattern key
match subject:
case {*key}:
pass
case {*key: 1}:
pass
case {*key 1}:
pass
case {*key, None: 1}:
pass

# Pattern cannot follow a double star pattern
# Multiple double star patterns are not allowed
match subject:
case {**rest, None: 1}:
pass
case {**rest1, **rest2, None: 1}:
pass
case {**rest1, None: 1, **rest2}:
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
# Star pattern is only allowed inside a sequence pattern
match subject:
case *_:
pass
case *_ as x:
pass
case *foo:
pass
case *foo | 1:
pass
case 1 | *foo:
pass
case Foo(*_):
pass
case Foo(x=*_):
pass
case {*_}:
pass
case {*_: 1}:
pass
case {None: *_}:
pass
case 1 + *_:
pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
# Unary addition isn't allowed but we parse it for better error recovery.
match subject:
case +1:
pass
case 1 | +2 | -3:
pass
case [1, +2, -3]:
pass
case Foo(x=+1, y=-2):
pass
case {True: +1, False: -2}:
pass
7 changes: 5 additions & 2 deletions crates/ruff_python_parser/resources/valid/statement/match.py
Original file line number Diff line number Diff line change
Expand Up @@ -164,6 +164,7 @@
z = 0

match x:
# F-strings aren't allowed as patterns but it's a soft syntax error in Python.
case f"{y}":
pass
match {"test": 1}:
Expand Down Expand Up @@ -240,6 +241,8 @@

# PatternMatchAs
match x:
case a:
...
case a as b:
...
case 1 | 2 as two:
Expand Down Expand Up @@ -272,9 +275,9 @@

# PatternMatchStar
match x:
case *a:
case *a,:
...
case *_:
case *_,:
...
case [1, 2, *rest]:
...
Expand Down
17 changes: 17 additions & 0 deletions crates/ruff_python_parser/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -117,8 +117,16 @@ pub enum ParseErrorType {
DefaultArgumentError,
/// A simple statement and a compound statement was found in the same line.
SimpleStmtAndCompoundStmtInSameLine,

/// An invalid `match` case pattern was found.
InvalidMatchPatternLiteral { pattern: TokenKind },
/// A star pattern was found outside a sequence pattern.
StarPatternUsageError,
/// Expected a real number for a complex literal pattern.
ExpectedRealNumber,
/// Expected an imaginary number for a complex literal pattern.
ExpectedImaginaryNumber,

/// The parser expected a specific token that was not found.
ExpectedToken {
expected: TokenKind,
Expand Down Expand Up @@ -183,6 +191,15 @@ impl std::fmt::Display for ParseErrorType {
ParseErrorType::InvalidMatchPatternLiteral { pattern } => {
write!(f, "invalid pattern `{pattern:?}`")
}
ParseErrorType::StarPatternUsageError => {
write!(f, "Star pattern cannot be used here")
}
ParseErrorType::ExpectedRealNumber => {
write!(f, "Expected a real number in complex literal pattern")
}
ParseErrorType::ExpectedImaginaryNumber => {
write!(f, "Expected an imaginary number in complex literal pattern")
}
ParseErrorType::UnexpectedIndentation => write!(f, "unexpected indentation"),
ParseErrorType::InvalidAssignmentTarget => write!(f, "invalid assignment target"),
ParseErrorType::InvalidNamedAssignmentTarget => {
Expand Down
2 changes: 1 addition & 1 deletion crates/ruff_python_parser/src/parser/expression.rs
Original file line number Diff line number Diff line change
Expand Up @@ -709,7 +709,7 @@ impl<'src> Parser<'src> {
})
}

fn parse_unary_expression(&mut self) -> ast::ExprUnaryOp {
pub(super) fn parse_unary_expression(&mut self) -> ast::ExprUnaryOp {
let start = self.node_start();

let op = UnaryOp::try_from(self.current_token_kind())
Expand Down
12 changes: 10 additions & 2 deletions crates/ruff_python_parser/src/parser/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ mod expression;
mod helpers;
mod pattern;
mod progress;
mod recovery;
mod statement;
#[cfg(test)]
mod tests;
Expand Down Expand Up @@ -988,8 +989,15 @@ impl RecoveryContextKind {
| RecoveryContextKind::SetElements
| RecoveryContextKind::TupleElements(_) => p.at_expr(),
RecoveryContextKind::DictElements => p.at(TokenKind::DoubleStar) || p.at_expr(),
RecoveryContextKind::SequenceMatchPattern(_) => p.at_pattern_start(),
RecoveryContextKind::MatchPatternMapping => p.at_mapping_pattern_start(),
RecoveryContextKind::SequenceMatchPattern(_) => {
// `+` doesn't start any pattern but is here for better error recovery.
p.at(TokenKind::Plus) || p.at_pattern_start()
}
RecoveryContextKind::MatchPatternMapping => {
// A star pattern is invalid as a mapping key and is here only for
// better error recovery.
p.at(TokenKind::Star) || p.at_mapping_pattern_start()
}
RecoveryContextKind::MatchPatternClassArguments => p.at_pattern_start(),
RecoveryContextKind::Arguments => p.at_expr(),
RecoveryContextKind::DeleteTargets => p.at_expr(),
Expand Down

0 comments on commit 236aa4a

Please sign in to comment.