Discrepancy between `format` and `check` for a long string literal #9926

fjarri · 2024-02-10T21:54:26Z

Ruff version: 0.2.1

Consider the following code:

a = (
    "very very very very very very very very very very very very very very very very very very long line",
)

> ruff check --select E
t.py:2:89: E501 Line too long (106 > 88)
Found 1 error.
> ruff format
1 file left unchanged

So ruff format doesn't find anything to fix, but ruff check still complains about the formatting. Is it intentional? I feel like these two commands should use the same rules when deciding what is a formatting error. But I understand the current approach too (if it was planned) - it's hard to automatically fix that long line; one can split it in two, but the position of splitting can't be chosen without understanding what the literal contains.

The text was updated successfully, but these errors were encountered:

fjarri · 2024-02-10T22:01:26Z

Actually, there is a case where a reformatting can be applied to fix the issue, but isn't.

a = {
    "key": "very very very very very very very very very very very very very long line  "
}

ruff format does not change it, and ruff check complains about the long line (89 characters). But it can be rewritten as

a = {
    "key": (
        "very very very very very very very very very very very very very long line  "
    )
}

Which puts it within 88 characters.

charliermarsh · 2024-02-11T00:08:01Z

Ah yeah, this is intentional and it's discussed a bit in the linter-formatter compatibility section of the docs. In short, the formatter treats --line-length as a preference such that it attempts to format code to adhere to the line length but does not guarantee it, whereas the linter treats it as a rule (with a few exceptions, e.g., for URLs and other unsplittable text). In practice, this is most obvious for comments and documentation (which can exceed the line length, but the formatter won't touch), but can also impact long strings and long identifiers.

Your second comment may be related to #8438 which were some changes we considered making to expression formatting but decided to revisit holistically before making further changes -- though @MichaReiser would be able to say for sure whether he'd expect that reformatting to happen today in latest Ruff (or not).

MichaReiser · 2024-02-11T09:39:34Z

a = {
    "key": (
        "very very very very very very very very very very very very very long line  "
    )
}

Ruff doesn't parenthesize long dict values. Black considers parenthesizing long dictionary values, but this is a preview style. Parenthesizing values in expression positions sets a new precedence in Black's style guide and is something that should be considered carefully (see my feedback on Black's preview style).

Parenthesizing values that can't be split further is an option. It raises the question if the parentheses should also be added if the string, when parenthesized, doesn't fit:

a = {
    "key": (
        "very very very very very very very very very very very very very long long long long line  "
    )
}

IMO, I would always parenthesize it because it can reduce the amount that the string exceeds the line length limit (not guaranteed). However, this "optimisation" is currently implemented for assignments where the string only gets parenthesized if it makes it fit on the line. Unfortunately, this style has a larger impact on runtime performance because, in the worst case, the formatter needs to consider 3 layouts for every string:

unparenthesized
parenthesized
fall-back to unparenthesized

CarlGWatts · 2024-02-21T20:09:41Z

If there is a long string key and a long string value in a dict literal, ruff format won't insert a newline after the ":" in a dict literal even if that means the line exceeds the max line length.

{
    "blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah": "blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah"
}

In fact, if there is a newline after the ":" in a dict literal, ruff format will remove that newline even if that makes the line exceed the max line length.

I consider both of those bugs in ruff format

The python formatter built into PyCharm does the correct thing and inserts a newline after the ":" if necessary to keep the line under the max line length.

{
    "blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah":
        "blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah"
}

gaborbernat · 2024-02-21T21:46:07Z

I'm running into the much more simple case of:

-            line = f"{arg_match} {line[len(arg_match):]}"
+            line = f"{arg_match} {line[len(arg_match) :]}"

Where the diff is formatter the first, and check the second... https://results.pre-commit.ci/run/github/68465360/1708361725.A80kgi75QACdOne_exTHUA @MichaReiser is this a valid bug?

MichaReiser · 2024-02-22T13:49:41Z

@gaborbernat yes, this is a bug (with the lint rule) and tracked in #10041 (comment)

MichaReiser · 2024-02-23T09:59:50Z

@gaborbernat I'm trying to reproduce your diff, but I'm struggling to understand it. What I understand is that.

The formatter produces the second output (+)
The linter fix produces the first output (-)

That the fix's formatting doesn't precisely matches the formatter's isn't intended but accepted. We have plans to format all fixes automatically, but we aren't today. That's why we recommend running ruff check --fix before ruff format, so that the formatter formats all fixes (there are other, more involved fixes that also don't match the formatter's output).

We don't intend to make the E203 fix fully formatter compatible (the check should be compatible, but not the fix). That would require re-implementing the entire formatter logic. In general, we advise disabling stylistic lint rules if you use the formatter because these lints are redundant to the formatter, and the formatter gives you autofixes for all errors. We mainly offer stylistic lints for users who don't want to or can't use our formatter but still want to enforce some stylistic rules. With the caveat that they may need to manually format some fixes.

gaborbernat · 2024-02-23T22:09:55Z

@henryiii told me the opposite, that the formatter should run first and then the checker because the checker is guaranteed to keep the formatters style...

We don't intend to make the E203 fix fully formatter compatible (the check should be compatible, but not the fix). That would require re-implementing the entire formatter logic. In general, we advise disabling stylistic lint rules if you use the formatter because these lints are redundant to the formatter, and the formatter gives you autofixes for all errors

I don't want to manually define a lot of stylistic checks one by one. Do you have a group alias for these? So I can add to the ignore something like FORMATTER to ignore all such rules?

henryiii · 2024-02-23T22:14:25Z

Where did I say that? That sounds reversed from what I was told by @charliermarsh and I think I always put it, the formatter is guaranteed to not introduce new failing checks, but the checker doesn't format its output.

https://github.com/pypa/build/blob/9ceb49db39d81d5e7efc50a371e4612323fbdb80/.pre-commit-config.yaml#L35-L37 for example, follows that. As does https://learn.scientific-python.org/development/guides/style/#format (with "ruff would go here").

gaborbernat · 2024-02-23T22:16:28Z

Perhaps I've read the opposite but is what I remembered 😅

MichaReiser · 2024-02-23T22:18:33Z

I don't want to manually define a lot of stylistic checks one by one. Do you have a group alias for these? So I can add to the ignore something like FORMATTER to ignore all such rules?

Unfortunately not yet but we plan to change this with #1774

gaborbernat · 2024-02-24T21:24:00Z

I don't want to manually define a lot of stylistic checks one by one. Do you have a group alias for these? So I can add to the ignore something like FORMATTER to ignore all such rules?

Unfortunately not yet but we plan to change this with #1774

Anything we can do to help there?

gaborbernat · 2024-02-24T21:24:00Z

I don't want to manually define a lot of stylistic checks one by one. Do you have a group alias for these? So I can add to the ignore something like FORMATTER to ignore all such rules?

Unfortunately not yet but we plan to change this with #1774

Anything we can do to help there?

MichaReiser · 2024-02-25T09:17:38Z

Anything we can do to help there?

Not yet, because we haven't started yet. But we plan to pick this up soon (I think @AlexWaygood would be the perfect person to do it) and your feedback on the categorization would be extremely valuable.

charliermarsh added the formatter Related to the formatter label Feb 11, 2024

MichaReiser added the style How should formatted code look label Feb 23, 2024

timobrembeck mentioned this issue May 6, 2024

Configure ruff import sorting and drop isort support digitalfabrik/integreat-cms#2758

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Discrepancy between `format` and `check` for a long string literal #9926

Discrepancy between `format` and `check` for a long string literal #9926

fjarri commented Feb 10, 2024

fjarri commented Feb 10, 2024

charliermarsh commented Feb 11, 2024

MichaReiser commented Feb 11, 2024

CarlGWatts commented Feb 21, 2024 •

edited

gaborbernat commented Feb 21, 2024

MichaReiser commented Feb 22, 2024

MichaReiser commented Feb 23, 2024

gaborbernat commented Feb 23, 2024

henryiii commented Feb 23, 2024

gaborbernat commented Feb 23, 2024

MichaReiser commented Feb 23, 2024

gaborbernat commented Feb 24, 2024

gaborbernat commented Feb 24, 2024

MichaReiser commented Feb 25, 2024

Discrepancy between format and check for a long string literal #9926

Discrepancy between format and check for a long string literal #9926

Comments

fjarri commented Feb 10, 2024

fjarri commented Feb 10, 2024

charliermarsh commented Feb 11, 2024

MichaReiser commented Feb 11, 2024

CarlGWatts commented Feb 21, 2024 • edited

gaborbernat commented Feb 21, 2024

MichaReiser commented Feb 22, 2024

MichaReiser commented Feb 23, 2024

gaborbernat commented Feb 23, 2024

henryiii commented Feb 23, 2024

gaborbernat commented Feb 23, 2024

MichaReiser commented Feb 23, 2024

gaborbernat commented Feb 24, 2024

gaborbernat commented Feb 24, 2024

MichaReiser commented Feb 25, 2024

Discrepancy between `format` and `check` for a long string literal #9926

Discrepancy between `format` and `check` for a long string literal #9926

CarlGWatts commented Feb 21, 2024 •

edited