Improve adjusted_lines's diff algorithm used in --line-ranges. #4052

Open
yilei opened this issue Nov 18, 2023 · 0 comments
Labels
T: bug Something isn't working


Currently, Black performs two formatting passes, plus an extra formatting pass for the stability check in --safe mode.

When performing a second formatting pass with --line-ranges, we can't simply reuse the original line ranges specified by the user. For example, take this unformatted code:

def restrict_to_this_line(arg1,
  arg2,
  arg3):
    print  ( "This should not be formatted." )
    print  ( "This should not be formatted." )

If we let Black format only lines 1-3, after the first pass it becomes:

def restrict_to_this_line(arg1, arg2, arg3):
    print  ( "This should not be formatted." )
    print  ( "This should not be formatted." )

If the second pass reused the original range 1-3, it would now cover every line of the file, including the two print statements that were supposed to be left alone.

To solve this, we have an adjusted_lines function that calculates the new line ranges for the second pass. It uses the diffing algorithm from difflib.SequenceMatcher. Unfortunately, it can produce undesired results in certain edge cases, such as a run of unformatted lines with identical content.
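To illustrate the kind of remapping involved, here is a minimal sketch built on difflib.SequenceMatcher opcodes. The helper map_range_sketch is hypothetical and is not Black's adjusted_lines. It only assumes that unchanged lines keep their offset within a matching block and that a range touching a changed block snaps to that block's boundaries.

```python
import difflib


def map_range_sketch(start, end, original_lines, formatted_lines):
    """Map a 1-based inclusive range from the original to the formatted lines.

    Hypothetical helper for illustration only, not Black's adjusted_lines.
    """
    opcodes = difflib.SequenceMatcher(
        None, original_lines, formatted_lines
    ).get_opcodes()

    def locate(line, is_end):
        index = line - 1  # opcodes use 0-based, half-open ranges
        for tag, a_lo, a_hi, b_lo, b_hi in opcodes:
            if a_lo <= index < a_hi:
                if tag == "equal":
                    # Unchanged line: keep its offset inside the block.
                    return b_lo + (index - a_lo) + 1
                # Changed block: snap to the block's first or last new line.
                return b_hi if is_end else b_lo + 1
        raise ValueError(f"line {line} is outside the original source")

    return locate(start, False), locate(end, True)


original = [
    "def restrict_to_this_line(arg1,",
    "  arg2,",
    "  arg3):",
    '    print  ( "This should not be formatted." )',
    '    print  ( "This should not be formatted." )',
]
after_first_pass = [
    "def restrict_to_this_line(arg1, arg2, arg3):",
    '    print  ( "This should not be formatted." )',
    '    print  ( "This should not be formatted." )',
]

print(map_range_sketch(1, 3, original, after_first_pass))  # (1, 1)
```

For the restrict_to_this_line example, the original range 1-3 maps to 1-1, so a second pass would leave the print statements alone.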

As an example of such an edge case, consider five identical unformatted lines:

print ( "format me" )
print ( "format me" )
print ( "format me" )
print ( "format me" )
print ( "format me" )

Using --line-ranges=2-3, the result of the first pass is:

print ( "format me" )
print("format me")
print("format me")
print ( "format me" )
print ( "format me" )

Given the input range 2-3 together with the original and first-pass code, adjusted_lines returns 5-5. After the second pass the code becomes:

print ( "format me" )
print("format me")
print("format me")
print ( "format me" )
print("format me")
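The mismatch can be reproduced with difflib alone. The sketch below feeds the original lines and the first-pass lines to difflib.SequenceMatcher and prints the matching blocks. It does not call Black, and the variable names are illustrative.

```python
import difflib

UNFORMATTED = 'print ( "format me" )'
FORMATTED = 'print("format me")'

original = [UNFORMATTED] * 5
first_pass = [UNFORMATTED, FORMATTED, FORMATTED, UNFORMATTED, UNFORMATTED]

matcher = difflib.SequenceMatcher(None, original, first_pass)
# Each block is (index in original, index in first_pass, length), 0-based.
blocks = [tuple(block) for block in matcher.get_matching_blocks()]
print(blocks)  # [(0, 3, 2), (5, 5, 0)]
```

SequenceMatcher prefers the longest contiguous match, so it pairs original lines 1-2 with first-pass lines 4-5 instead of pairing line 1 with line 1 and lines 4-5 with lines 4-5. Original line 2 therefore lands on line 5, which is consistent with the 5-5 range reported above.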

The end result is:

  1. --line-ranges might format extra lines when there are unformatted lines with the exact same content.
  2. The stability check fails in --safe mode.
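Both consequences can be seen in a small simulation. The sketch below uses a toy stand-in for the formatter (toy_format is hypothetical and only rewrites the one pattern from this issue) and compares the two passes the way a stability check would, assuming the check simply requires the second pass to leave the first-pass output unchanged.

```python
UNFORMATTED = 'print ( "format me" )'
FORMATTED = 'print("format me")'


def toy_format(lines, start, end):
    """Toy formatter: normalize lines start..end (1-based, inclusive).

    Hypothetical stand-in for Black, used only to simulate the two passes.
    """
    result = list(lines)
    for index in range(start - 1, end):
        result[index] = result[index].replace(UNFORMATTED, FORMATTED)
    return result


original = [UNFORMATTED] * 5

# First pass with the user's range 2-3.
first_pass = toy_format(original, 2, 3)

# Second pass with the range 5-5 reported in this issue.
second_pass = toy_format(first_pass, 5, 5)

# Second pass with the range one would expect, 2-3.
expected_second_pass = toy_format(first_pass, 2, 3)

print(second_pass == first_pass)           # False: line 5 was also formatted
print(expected_second_pass == first_pass)  # True: output is stable
```

With the range 5-5 the second pass changes a line outside the user's request, so the two passes disagree. With the range 2-3 the second pass is a no-op.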

In #4034, we are disabling the stability check for now. This issue tracks the improvement of adjusted_lines so that we can eventually fix the underlying issue and re-enable the stability check.

@yilei yilei added the T: bug Something isn't working label Nov 18, 2023