Skip to content

Commit

Permalink
Preview minimal f-string formatting (#9642)
Browse files Browse the repository at this point in the history
## Summary

_This is preview only feature and is available using the `--preview`
command-line flag._

With the implementation of [PEP 701] in Python 3.12, f-strings can now
be broken into multiple lines, can contain comments, and can re-use the
same quote character. Currently, no other Python formatter formats the
f-strings so there's some discussion which needs to happen in defining
the style used for f-string formatting. Relevant discussion:
#9785

The goal for this PR is to add minimal support for f-string formatting.
This would be to format expression within the replacement field without
introducing any major style changes.

### Newlines

The heuristics for adding newline is similar to that of
[Prettier](https://prettier.io/docs/en/next/rationale.html#template-literals)
where the formatter would only split an expression in the replacement
field across multiple lines if there was already a line break within the
replacement field.

In other words, the formatter would not add any newlines unless they
were already present i.e., they were added by the user. This makes
breaking any expression inside an f-string optional and in control of
the user. For example,

```python
# We wouldn't break this
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"

# But, we would break the following as there's already a newline
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
	aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
```


If there are comments in any of the replacement field of the f-string,
then it will always be a multi-line f-string in which case the formatter
would prefer to break expressions i.e., introduce newlines. For example,

```python
x = f"{ # comment
    a }"
```

### Quotes

The logic for formatting quotes remains unchanged. The existing logic is
used to determine the necessary quote char and is used accordingly.

Now, if the expression inside an f-string is itself a string like, then
we need to make sure to preserve the existing quote and not change it to
the preferred quote unless it's 3.12. For example,

```python
f"outer {'inner'} outer"

# For pre 3.12, preserve the single quote
f"outer {'inner'} outer"

# While for 3.12 and later, the quotes can be changed
f"outer {"inner"} outer"
```

But, for triple-quoted strings, we can re-use the same quote char unless
the inner string is itself a triple-quoted string.

```python
f"""outer {"inner"} outer"""  # valid
f"""outer {'''inner'''} outer"""  # preserve the single quote char for the inner string
```

### Debug expressions

If debug expressions are present in the replacement field of a f-string,
then the whitespace needs to be preserved as they will be rendered as it
is (for example, `f"{ x = }"`. If there are any nested f-strings, then
the whitespace in them needs to be preserved as well which means that
we'll stop formatting the f-string as soon as we encounter a debug
expression.

```python
f"outer {   x =  !s  :.3f}"
#                  ^^
#                  We can remove these whitespaces
```

Now, the whitespace doesn't need to be preserved around conversion spec
and format specifiers, so we'll format them as usual but we won't be
formatting any nested f-string within the format specifier.

### Miscellaneous

- The
[`hug_parens_with_braces_and_square_brackets`](#8279)
preview style isn't implemented w.r.t. the f-string curly braces.
- The
[indentation](#9785 (comment))
is always relative to the f-string containing statement

## Test Plan

* Add new test cases
* Review existing snapshot changes
* Review the ecosystem changes

[PEP 701]: https://peps.python.org/pep-0701/
  • Loading branch information
dhruvmanila committed Feb 16, 2024
1 parent c47ff65 commit 72bf1c2
Show file tree
Hide file tree
Showing 20 changed files with 1,973 additions and 56 deletions.
@@ -0,0 +1,8 @@
[
{
"preview": "enabled"
},
{
"preview": "disabled"
}
]
Expand Up @@ -30,30 +30,30 @@
# an expression inside a formatted value
(
f'{1}'
# comment
# comment 1
''
)

(
f'{1}' # comment
f'{1}' # comment 2
f'{2}'
)

(
f'{1}'
f'{2}' # comment
f'{2}' # comment 3
)

(
1, ( # comment
1, ( # comment 4
f'{2}'
)
)

(
(
f'{1}'
# comment
# comment 5
),
2
)
Expand All @@ -62,3 +62,221 @@
x = f'''a{""}b'''
y = f'''c{1}d"""e'''
z = f'''a{""}b''' f'''c{1}d"""e'''

# F-String formatting test cases (Preview)

# Simple expression with a mix of debug expression and comments.
x = f"{a}"
x = f"{
a = }"
x = f"{ # comment 6
a }"
x = f"{ # comment 7
a = }"

# Remove the parentheses as adding them doesn't make then fit within the line length limit.
# This is similar to how we format it before f-string formatting.
aaaaaaaaaaa = (
f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd } cccccccccc"
)
# Here, we would use the best fit layout to put the f-string indented on the next line
# similar to the next example.
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
aaaaaaaaaaa = (
f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
)

# This should never add the optional parentheses because even after adding them, the
# f-string exceeds the line length limit.
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { "bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { "bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" = } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { # comment 8
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { # comment 9
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" = } ccccccccccccccc"

# Multiple larger expressions which exceeds the line length limit. Here, we need to decide
# whether to split at the first or second expression. This should work similarly to the
# assignment statement formatting where we split from right to left in preview mode.
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"

# The above example won't split but when we start introducing line breaks:
x = f"aaaaaaaaaaaa {
bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb
} cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc {
ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd
} eeeeeeeeeeeeee"

# But, in case comments are present, we would split at the expression containing the
# comments:
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb # comment 10
} cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb
} cccccccccccccccccccc { # comment 11
ddddddddddddddd } eeeeeeeeeeeeee"

# Here, the expression part itself starts with a curly brace so we need to add an extra
# space between the opening curly brace and the expression.
x = f"{ {'x': 1, 'y': 2} }"
# Although the extra space isn't required before the ending curly brace, we add it for
# consistency.
x = f"{ {'x': 1, 'y': 2}}"
x = f"{ {'x': 1, 'y': 2} = }"
x = f"{ # comment 12
{'x': 1, 'y': 2} }"
x = f"{ # comment 13
{'x': 1, 'y': 2} = }"

# But, in this case, we would split the expression itself because it exceeds the line
# length limit so we need not add the extra space.
xxxxxxx = f"{
{'aaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbb', 'ccccccccccccccccccccc'}
}"
# And, split the expression itself because it exceeds the line length.
xxxxxxx = f"{
{'aaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccccccccccccc'}
}"

# Quotes
f"foo 'bar' {x}"
f"foo \"bar\" {x}"
f'foo "bar" {x}'
f'foo \'bar\' {x}'
f"foo {"bar"}"
f"foo {'\'bar\''}"

# Here, the formatter will remove the escapes which is correct because they aren't allowed
# pre 3.12. This means we can assume that the f-string is used in the context of 3.12.
f"foo {'\"bar\"'}"


# Triple-quoted strings
# It's ok to use the same quote char for the inner string if it's single-quoted.
f"""test {'inner'}"""
f"""test {"inner"}"""
# But if the inner string is also triple-quoted then we should preserve the existing quotes.
f"""test {'''inner'''}"""

# Magic trailing comma
#
# The expression formatting will result in breaking it across multiple lines with a
# trailing comma but as the expression isn't already broken, we will remove all the line
# breaks which results in the trailing comma being present. This test case makes sure
# that the trailing comma is removed as well.
f"aaaaaaa {['aaaaaaaaaaaaaaa', 'bbbbbbbbbbbbb', 'ccccccccccccccccc', 'ddddddddddddddd', 'eeeeeeeeeeeeee']} aaaaaaa"

# And, if the trailing comma is already present, we still need to remove it.
f"aaaaaaa {['aaaaaaaaaaaaaaa', 'bbbbbbbbbbbbb', 'ccccccccccccccccc', 'ddddddddddddddd', 'eeeeeeeeeeeeee',]} aaaaaaa"

# Keep this Multiline by breaking it at the square brackets.
f"""aaaaaa {[
xxxxxxxx,
yyyyyyyy,
]} ccc"""

# Add the magic trailing comma because the elements don't fit within the line length limit
# when collapsed.
f"aaaaaa {[
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
yyyyyyyyyyyy
]} ccccccc"

# Remove the parenthese because they aren't required
xxxxxxxxxxxxxxx = (
f"aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbb {
xxxxxxxxxxx # comment 14
+ yyyyyyyyyy
} dddddddddd"
)

# Comments

# No comments should be dropped!
f"{ # comment 15
# comment 16
foo # comment 17
# comment 18
}" # comment 19
# comment 20

# Conversion flags
#
# This is not a valid Python code because of the additional whitespace between the `!`
# and conversion type. But, our parser isn't strict about this. This should probably be
# removed once we have a strict parser.
x = f"aaaaaaaaa { x ! r }"

# Even in the case of debug expresions, we only need to preserve the whitespace within
# the expression part of the replacement field.
x = f"aaaaaaaaa { x = ! r }"

# Combine conversion flags with format specifiers
x = f"{x = ! s
:>0
}"
# This is interesting. There can be a comment after the format specifier but only if it's
# on it's own line. Refer to https://github.com/astral-sh/ruff/pull/7787 for more details.
# We'll format is as trailing comments.
x = f"{x !s
:>0
# comment 21
}"

x = f"""
{ # comment 22
x = :.0{y # comment 23
}f}"""

# Here, the debug expression is in a nested f-string so we should start preserving
# whitespaces from that point onwards. This means we should format the outer f-string.
x = f"""{"foo " + # comment 24
f"{ x =
}" # comment 25
}
"""

# Mix of various features.
f"{ # comment 26
foo # after foo
:>{
x # after x
}
# comment 27
# comment 28
} woah {x}"

# Indentation

# What should be the indentation?
# https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590
if indent0:
if indent1:
if indent2:
foo = f"""hello world
hello {
f"aaaaaaa {
[
'aaaaaaaaaaaaaaaaaaaaa',
'bbbbbbbbbbbbbbbbbbbbb',
'ccccccccccccccccccccc',
'ddddddddddddddddddddd'
]
} bbbbbbbb" +
[
'aaaaaaaaaaaaaaaaaaaaa',
'bbbbbbbbbbbbbbbbbbbbb',
'ccccccccccccccccccccc',
'ddddddddddddddddddddd'
]
} --------
"""
@@ -0,0 +1,5 @@
[
{
"target_version": "py312"
}
]
@@ -0,0 +1,6 @@
# This file contains test cases only for cases where the logic tests for whether
# the target version is 3.12 or later. A user can have 3.12 syntax even if the target
# version isn't set.

# Quotes re-use
f"{'a'}"
12 changes: 11 additions & 1 deletion crates/ruff_python_formatter/src/builders.rs
@@ -1,7 +1,7 @@
use ruff_formatter::{write, Argument, Arguments};
use ruff_text_size::{Ranged, TextRange, TextSize};

use crate::context::{NodeLevel, WithNodeLevel};
use crate::context::{FStringState, NodeLevel, WithNodeLevel};
use crate::other::commas::has_magic_trailing_comma;
use crate::prelude::*;

Expand Down Expand Up @@ -206,6 +206,16 @@ impl<'fmt, 'ast, 'buf> JoinCommaSeparatedBuilder<'fmt, 'ast, 'buf> {

pub(crate) fn finish(&mut self) -> FormatResult<()> {
self.result.and_then(|()| {
// If the formatter is inside an f-string expression element, and the layout
// is flat, then we don't need to add a trailing comma.
if let FStringState::InsideExpressionElement(context) =
self.fmt.context().f_string_state()
{
if context.layout().is_flat() {
return Ok(());
}
}

if let Some(last_end) = self.entries.position() {
let magic_trailing_comma = has_magic_trailing_comma(
TextRange::new(last_end, self.sequence_end),
Expand Down
22 changes: 22 additions & 0 deletions crates/ruff_python_formatter/src/comments/placement.rs
Expand Up @@ -289,6 +289,28 @@ fn handle_enclosed_comment<'a>(
}
}
AnyNodeRef::FString(fstring) => CommentPlacement::dangling(fstring, comment),
AnyNodeRef::FStringExpressionElement(_) => {
// Handle comments after the format specifier (should be rare):
//
// ```python
// f"literal {
// expr:.3f
// # comment
// }"
// ```
//
// This is a valid comment placement.
if matches!(
comment.preceding_node(),
Some(
AnyNodeRef::FStringExpressionElement(_) | AnyNodeRef::FStringLiteralElement(_)
)
) {
CommentPlacement::trailing(comment.enclosing_node(), comment)
} else {
handle_bracketed_end_of_line_comment(comment, locator)
}
}
AnyNodeRef::ExprList(_)
| AnyNodeRef::ExprSet(_)
| AnyNodeRef::ExprListComp(_)
Expand Down

0 comments on commit 72bf1c2

Please sign in to comment.