
Add PEP701 support #3822

Merged: 91 commits into psf:main, Apr 22, 2024

Conversation

@tusharsadhwani (Contributor) commented Jul 29, 2023

Description

Adds support for PEP 701: Syntactic formalization of f-strings to Black's tokenizer.

Resolves #3746

Given this Python file:

x = f"foo{2 + 2}bar"

Previous output:

$ python src/blib2to3/pgen2/tokenize.py asd.py
1,0-1,1:        NAME    'x'
1,2-1,3:        OP      '='
1,4-1,20:       STRING  'f"foo{2 + 2}bar"'
1,20-1,21:      NEWLINE '\n'
2,0-2,0:        ENDMARKER       ''

Current output:

$ python src/blib2to3/pgen2/tokenize.py asd.py
1,0-1,1:        NAME    'x'
1,2-1,3:        OP      '='
1,4-1,2:        FSTRING_START   'f"'
1,2-1,5:        FSTRING_MIDDLE  'foo'
1,5-1,6:        LBRACE  '{'
1,6-1,7:        NUMBER  '2'
1,8-1,9:        OP      '+'
1,10-1,11:      NUMBER  '2'
1,12-1,13:      RBRACE  '}'
1,12-1,15:      FSTRING_MIDDLE  'bar'
1,15-1,20:      FSTRING_END     '"'
1,20-1,21:      NEWLINE '\n'
2,0-2,0:        ENDMARKER       ''

Checklist - did you ...

  • Add an entry in CHANGES.md if necessary?
  • Add / update tests if necessary?
  • Add new / update outdated documentation?

@tusharsadhwani (Contributor, Author) commented Jul 29, 2023

@ambv This is the approach I'm thinking of: defining a tokenize_string generator, which in turn calls generate_tokens() recursively to tokenize the Python source inside replacement fields.

Does the approach seem ok? If yes, I'll go ahead and implement all the missing features.
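
For concreteness, here is a toy sketch of that recursion (all names here are hypothetical, and it works on a complete in-memory string, ignoring escapes and triple quotes; the real blib2to3 tokenizer is line-oriented, which is where the complications come in):

def tokenize_expr(src, i, tokens):
    # Stand-in for the recursive generate_tokens() call: consume source
    # until the matching '}', skipping over any nested string literal.
    start = i
    while src[i] != "}":
        if src[i] in ('"', "'"):  # a nested string may reuse the outer quote
            i = src.index(src[i], i + 1) + 1
        else:
            i += 1
    tokens.append(("EXPR", src[start:i]))
    return i  # index of the closing '}'

def tokenize_fstring(src, i, tokens):
    # Stand-in for tokenize_string(): emit FSTRING_* tokens for the
    # literal parts and recurse for each {...} replacement field.
    quote = src[i + 1]  # the character right after the 'f' prefix
    tokens.append(("FSTRING_START", src[i:i + 2]))
    i += 2
    start = i
    while src[i] != quote:
        if src[i] == "{":
            if i > start:
                tokens.append(("FSTRING_MIDDLE", src[start:i]))
            tokens.append(("LBRACE", "{"))
            i = tokenize_expr(src, i + 1, tokens)
            tokens.append(("RBRACE", "}"))
            start = i = i + 1
        else:
            i += 1
    if i > start:
        tokens.append(("FSTRING_MIDDLE", src[start:i]))
    tokens.append(("FSTRING_END", quote))
    return i

tokens = []
tokenize_fstring('f"foo{2 + 2}bar"', 0, tokens)
print(tokens)
# [('FSTRING_START', 'f"'), ('FSTRING_MIDDLE', 'foo'), ('LBRACE', '{'),
#  ('EXPR', '2 + 2'), ('RBRACE', '}'), ('FSTRING_MIDDLE', 'bar'),
#  ('FSTRING_END', '"')]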

@tusharsadhwani (Contributor, Author):

cc @JelleZijlstra as well, for the above comment.

(resolved review thread on src/blib2to3/pgen2/tokenize.py, now outdated)
@tusharsadhwani (Contributor, Author):

Also, although I don't think this change will be backwards incompatible, should it be possible to keep the old behaviour intact? Should this be behind the experimental flag?

@JelleZijlstra (Collaborator):

What aspect won't be backward compatible? Shouldn't we just start parsing code we previously failed on?

@tusharsadhwani (Contributor, Author):

@JelleZijlstra I thought so too. But I'm not 100% confident.

Do we want to preserve parsing failures for 3.11 and before? Or can we just drop the old tokenizer and parser?

@ambv (Collaborator) commented Aug 2, 2023

I think the proposed approach is wrong. You can't reuse the existing understanding of strings and decompose them separately. You don't know the end quote up front anymore, because this is now valid:

f = f"abc{"def"}"

You need to use a stack inside the existing generate_tokens().
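
To make the failure mode concrete: a "find the matching end quote" scan, which is roughly what a regex-based string tokenizer does, splits this example in the wrong place (a toy pattern, not blib2to3's actual one):

import re

naive = re.compile(r'f"[^"]*"')
print(naive.match('f"abc{"def"}"').group())  # prints f"abc{" -- the wrong split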

@JelleZijlstra (Collaborator):

@tusharsadhwani It's fine if we parse this code in earlier Python versions too. Catching syntax errors is not a goal for Black.

@tusharsadhwani (Contributor, Author) commented Aug 2, 2023

@ambv wouldn't the stack frames made by calling generate_tokens() recursively do the same job of maintaining a stack?

You can think of every call to tokenize_string() as pushing an fstring-mode onto the stack, and every generate_tokens() call inside it as pushing a normal-mode.

@ambv (Collaborator) commented Aug 2, 2023

No, the call stack isn't automatically compatible with the stack we need here.

You need to change how endprog in generate_tokens works for strings. This is where the new stack needs to be.

When you encounter f" then you switch to string literal tokenizing, looking for either " or { as endprog. When you encounter { you add } to the endprog stack and switch back to regular tokenizing... but since the stack is not empty, you watch out for a } to return back to string tokenizing.

That way you can encounter another ", switch to string tokenizing, and add a " as endprog to switch back to regular parsing. And when you reach that ", then you remove it from the stack, switch to regular parsing, but there's still } on the stack so you keep looking for that.

At no point are you iterating a counter like curly_brace_level. The endprog stack depth is the level.
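
A compressed toy of that discipline (hypothetical code, not the actual generate_tokens() change; it ignores escapes, {{ literals, and triple quotes). Each stack entry records what terminates the current mode, and the stack depth is the nesting level:

def scan(src):
    stack = []   # entries: ("fstr", quote), ("str", quote), or ("expr",)
    events = []
    for i, c in enumerate(src):
        top = stack[-1] if stack else None
        if top and top[0] in ("fstr", "str"):    # inside string literal text
            if c == top[1]:                      # the pending end quote
                stack.pop()
                events.append(("end " + top[0], i))
            elif c == "{" and top[0] == "fstr":  # replacement field opens
                stack.append(("expr",))
                events.append(("enter expr", i))
        else:                                    # regular tokenizing
            if c in ('"', "'"):
                kind = "fstr" if i and src[i - 1] in "fF" else "str"
                stack.append((kind, c))
                events.append(("enter " + kind, i))
            elif c == "}" and top == ("expr",):  # replacement field closes
                stack.pop()
                events.append(("exit expr", i))
    return events

print(scan('f = f"abc{"def"}"'))
# [('enter fstr', 5), ('enter expr', 9), ('enter str', 10),
#  ('end str', 14), ('exit expr', 15), ('end fstr', 16)]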

Is that clearer now?

@tusharsadhwani (Contributor, Author):

Yes, it's clearer now.

I'll implement the rest.

@github-actions (bot) commented Sep 10, 2023

diff-shades reports zero changes comparing this PR (ab2f43c) to main (944b99a).



@tusharsadhwani (Contributor, Author):

Yeah, haven't had sufficient time to fix that one, tomorrow hopefully!

@tusharsadhwani (Contributor, Author):

f'foo \{2+2}' is also broken; the string-tokenizing regex needs a revisit.

$ python -m blib2to3.pgen2.tokenize
f'foo \{2+2}'
1,0-1,2:        FSTRING_START   "f'"
1,2-1,12:       FSTRING_MIDDLE  'foo \\{2+2}'
1,12-1,13:      FSTRING_END     "'"
1,13-1,14:      NEWLINE '\n'

@tusharsadhwani (Contributor, Author):

@JelleZijlstra everything in test_fstring.py parses now, and all the tests in that file pass before and after formatting. I think that should cover everything.

@JelleZijlstra (Collaborator):

Thank you! Testing on our internal codebase found another crash:

f"""
    WITH {f'''
    {1}_cte AS ()'''}
"""

produces

error: cannot format /Users/jelle/py/black/nested.py: Cannot parse: 3:7:     {1}_cte AS ()'''}

Interestingly this happens only if there is a newline in the inner f-string.

This is Python 3.9 code, so valid before and after PEP 701.

@JelleZijlstra (Collaborator):

Other than the crashes though, this looks good and I'm hopeful we can merge it soon.

We should add a preview style feature to enable formatting inside f-strings too, but that can wait for another PR.

@tusharsadhwani (Contributor, Author):

Cool, let me take a look.

@tusharsadhwani (Contributor, Author):

@JelleZijlstra edge case taken care of.

@JelleZijlstra (Collaborator):

Thanks! I ran the branch on our company repo and on a venv full of interesting installed packages and found no more bugs. I'll read over the diff one more time and hopefully then we'll be ready.

f"{ 2 + 2 = }"

# TODO:
# f"""foo {
Collaborator:

Is it easy to fix this TODO in Black? I checked and this syntax works in 3.11 and with current Black, so it would be a regression if we start failing on it.

Contributor Author:

This is fixed now.

f'{(abc:=10)}'

f"This is a really long string, but just make sure that you reflow fstrings {
2+2:d
Collaborator:

This is a SyntaxError in 3.12.0 but not 3.12.1, I suppose due to python/cpython#112059. Ran into this because I ran the test suite on 3.12.0 and it failed.

No change requested, but it seems possible this will cause trouble again in the future.

WITH {f'''
{1}_cte AS ()'''}
"""

Collaborator:

Suggest adding this test case:

f"{
X
!r
}"

It works already, but good to cover it in the test suite.

@tusharsadhwani (Contributor, Author):

#4321 somehow broke the changes in this PR, even though visit_STRING and visit_NUMBER were not touched here. Looking into it, cc @hauntsaninja

@tusharsadhwani (Contributor, Author):

Is it OK to just revert that one?

@JelleZijlstra (Collaborator):

Pushed a fix for that issue.

@JelleZijlstra merged commit 551ede2 into psf:main on Apr 22, 2024
46 checks passed
@JelleZijlstra (Collaborator):

Congratulations, and thank you!

Since this fixes such a commonly encountered issue, I'll make a release soon so people can start using it.

@ichard26 (Collaborator):

Congratulations y'all, good work! 🖤
