Skip to content

Commit

Permalink
Fix stylist indentation with a formfeed (#7489)
Browse files Browse the repository at this point in the history
**Summary** In python, a formfeed is technically undefined behaviour
(https://docs.python.org/3/reference/lexical_analysis.html#indentation):
> A formfeed character may be present at the start of the line; it will
be ignored for
> the indentation calculations above. Formfeed characters occurring
elsewhere in the
> leading whitespace have an undefined effect (for instance, they may
reset the space
> count to zero).

In practice, they just reset the indentation:


https://github.com/python/cpython/blob/df8b3a46a7aa369f246a09ffd11ceedf1d34e921/Parser/tokenizer.c#L1819-L1821
https://github.com/astral-sh/ruff/blob/a41bb2733fe75a71f4cf6d4bb21e659fc4630b30/crates/ruff_python_parser/src/lexer.rs#L664-L667

The stylist didn't handle formfeeds previously and would produce invalid
indents. The remedy is to cut everything before a form feed.

Checks box for
#7455 (comment)

**Test Plan** Unit test for the stylist and a regression test for the
rule
  • Loading branch information
konstin committed Sep 19, 2023
1 parent ef34c5c commit 94b68f2
Show file tree
Hide file tree
Showing 4 changed files with 62 additions and 1 deletion.
9 changes: 9 additions & 0 deletions crates/ruff/resources/test/fixtures/flake8_bugbear/B006_4.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# formfeed indent
# https://github.com/astral-sh/ruff/issues/7455#issuecomment-1722458825
# This is technically a stylist bug (and has a test there), but it surfaced in B006


class FormFeedIndent:
def __init__(self, a=[]):
print(a)

1 change: 1 addition & 0 deletions crates/ruff/src/rules/flake8_bugbear/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ mod tests {
#[test_case(Rule::MutableArgumentDefault, Path::new("B006_1.py"))]
#[test_case(Rule::MutableArgumentDefault, Path::new("B006_2.py"))]
#[test_case(Rule::MutableArgumentDefault, Path::new("B006_3.py"))]
#[test_case(Rule::MutableArgumentDefault, Path::new("B006_4.py"))]
#[test_case(Rule::NoExplicitStacklevel, Path::new("B028.py"))]
#[test_case(Rule::RaiseLiteral, Path::new("B016.py"))]
#[test_case(Rule::RaiseWithoutFromInsideExcept, Path::new("B904.py"))]
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
---
source: crates/ruff/src/rules/flake8_bugbear/mod.rs
---
B006_4.py:7:26: B006 [*] Do not use mutable data structures for argument defaults
|
6 | class FormFeedIndent:
7 | def __init__(self, a=[]):
| ^^ B006
8 | print(a)
|
= help: Replace with `None`; initialize within function

Possible fix
4 4 |
5 5 |
6 6 | class FormFeedIndent:
7 |- def __init__(self, a=[]):
7 |+ def __init__(self, a=None):
8 |+ if a is None:
9 |+ a = []
8 10 | print(a)
9 11 |


29 changes: 28 additions & 1 deletion crates/ruff_python_codegen/src/stylist.rs
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,21 @@ fn detect_indention(tokens: &[LexResult], locator: &Locator) -> Indentation {
});

if let Some(indent_range) = indent_range {
let whitespace = locator.slice(*indent_range);
let mut whitespace = locator.slice(*indent_range);
// https://docs.python.org/3/reference/lexical_analysis.html#indentation
// > A formfeed character may be present at the start of the line; it will be ignored for
// > the indentation calculations above. Formfeed characters occurring elsewhere in the
// > leading whitespace have an undefined effect (for instance, they may reset the space
// > count to zero).
// So there's UB in python lexer -.-
// In practice, they just reset the indentation:
// https://github.com/python/cpython/blob/df8b3a46a7aa369f246a09ffd11ceedf1d34e921/Parser/tokenizer.c#L1819-L1821
// https://github.com/astral-sh/ruff/blob/a41bb2733fe75a71f4cf6d4bb21e659fc4630b30/crates/ruff_python_parser/src/lexer.rs#L664-L667
// We also reset the indentation when we see a formfeed character.
// See also https://github.com/astral-sh/ruff/issues/7455#issuecomment-1722458825
if let Some((_before, after)) = whitespace.rsplit_once('\x0C') {
whitespace = after;
}

Indentation(whitespace.to_string())
} else {
Expand Down Expand Up @@ -228,6 +242,19 @@ x = (
Stylist::from_tokens(&tokens, &locator).indentation(),
&Indentation::default()
);

// formfeed indent, see `detect_indention` comment.
let contents = r#"
class FormFeedIndent:
def __init__(self, a=[]):
print(a)
"#;
let locator = Locator::new(contents);
let tokens: Vec<_> = lex(contents, Mode::Module).collect();
assert_eq!(
Stylist::from_tokens(&tokens, &locator).indentation(),
&Indentation(" ".to_string())
);
}

#[test]
Expand Down

0 comments on commit 94b68f2

Please sign in to comment.