Ignore overlong pragma comments when enforcing linter line length #7692
Merged
2 changes: 2 additions & 0 deletions
crates/ruff_linter/resources/test/fixtures/pycodestyle/E501_1.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@ | ||
# TODO: comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` | ||
# TODO(charlie): comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` | ||
# TODO comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` | ||
# TODO comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` | ||
# FIXME: comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` | ||
# FIXME comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` | ||
# FIXME comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` | ||
# FIXME(charlie): comments starting with one of the configured task-tags sometimes are longer than line-length so that you can easily find them with `git grep` |
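The fixture lines above all begin with a configured task tag. As a rough illustration of how such a comment can be recognised, here is a minimal sketch that follows the `strip_prefix('#')`, `trim_start`, `starts_with` sequence used in this PR. The function name `starts_with_task_tag` and the `&[&str]` tag list are hypothetical simplifications and not part of the PR.

```rust
/// Returns `true` if `comment` (including its leading `#`) begins with one of
/// `task_tags`. Hypothetical helper for illustration only.
fn starts_with_task_tag(comment: &str, task_tags: &[&str]) -> bool {
    match comment.strip_prefix('#') {
        Some(rest) => {
            // Allow any amount of whitespace between `#` and the tag.
            let rest = rest.trim_start();
            task_tags.iter().any(|tag| rest.starts_with(*tag))
        }
        // Not a comment at all, so there is nothing to match.
        None => false,
    }
}
```

Because the match is a plain prefix test, `# TODO:`, `# TODO(charlie):` and `# TODO comments` are all accepted for the tag `TODO`, while a tag that appears later in the comment is not.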
11 changes: 11 additions & 0 deletions
crates/ruff_linter/resources/test/fixtures/pycodestyle/E501_3.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
# OK (88 characters) | ||
"shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:aaa" # type: ignore | ||
|
||
# OK (88 characters) | ||
"shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:aaa"# type: ignore | ||
|
||
# OK (88 characters) | ||
"shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:aaa" # type: ignore | ||
|
||
# Error (89 characters) | ||
"shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:" + "shape:aaaa" # type: ignore |
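The `E501_3.py` cases above depend on measuring the line after a trailing pragma comment has been removed. The sketch below illustrates that idea under several simplifying assumptions: the first `#` starts the comment (the PR uses the indexer's comment ranges instead), there are no tabs or wide characters, and `looks_like_pragma` is a hypothetical stand-in that knows only a few prefixes and does not reproduce `is_pragma_comment`.

```rust
/// Simplified stand-in for a pragma check (assumption: only these prefixes).
fn looks_like_pragma(comment: &str) -> bool {
    const PREFIXES: [&str; 4] = ["type:", "noqa", "pyright:", "pylint:"];
    match comment.strip_prefix('#') {
        Some(rest) => {
            let rest = rest.trim_start();
            PREFIXES.iter().any(|prefix| rest.starts_with(*prefix))
        }
        None => false,
    }
}

/// Width of `line` in characters once a trailing pragma comment is removed.
/// Assumes no tabs, single-width characters, and no `#` inside string literals.
fn width_without_pragma(line: &str) -> usize {
    match line.find('#') {
        // Drop the pragma and the whitespace that separated it from the code.
        Some(idx) if looks_like_pragma(&line[idx..]) => {
            line[..idx].trim_end().chars().count()
        }
        // No comment, or an ordinary comment: measure the whole line.
        _ => line.chars().count(),
    }
}
```

Under these assumptions, the amount of whitespace before `# type: ignore` does not affect the measured width, which is why the three "OK" fixtures all measure 88 and only the line with the extra `a` measures 89.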
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
use std::ops::Deref; | ||
|
||
use unicode_width::UnicodeWidthStr; | ||
|
||
use ruff_python_index::Indexer; | ||
use ruff_python_trivia::is_pragma_comment; | ||
use ruff_source_file::Line; | ||
use ruff_text_size::{TextLen, TextRange}; | ||
|
||
use crate::line_width::{LineLength, LineWidthBuilder, TabSize}; | ||
|
||
#[derive(Debug)] | ||
pub(super) struct Overlong { | ||
range: TextRange, | ||
width: usize, | ||
} | ||
|
||
impl Overlong { | ||
/// Returns an [`Overlong`] if the measured line exceeds the configured line length, or `None` | ||
/// otherwise. | ||
pub(super) fn try_from_line( | ||
line: &Line, | ||
indexer: &Indexer, | ||
limit: LineLength, | ||
task_tags: &[String], | ||
tab_size: TabSize, | ||
) -> Option<Self> { | ||
// The maximum width of the line is the number of bytes multiplied by the tab size (the | ||
// worst-case scenario is that the line is all tabs). If the maximum width is less than the | ||
// limit, then the line is not overlong. | ||
let max_width = line.len() * tab_size.as_usize(); | ||
if max_width < limit.value() as usize { | ||
return None; | ||
} | ||
|
||
// Measure the line. If it's already below the limit, exit early. | ||
let width = measure(line.as_str(), tab_size); | ||
if width <= limit { | ||
return None; | ||
} | ||
|
||
// Strip trailing comments and re-measure the line, if needed. | ||
let line = StrippedLine::from_line(line, indexer, task_tags); | ||
let width = match &line { | ||
StrippedLine::WithoutPragma(line) => { | ||
let width = measure(line.as_str(), tab_size); | ||
if width <= limit { | ||
return None; | ||
} | ||
width | ||
} | ||
StrippedLine::Unchanged(_) => width, | ||
}; | ||
|
||
let mut chunks = line.split_whitespace(); | ||
let (Some(_), Some(second_chunk)) = (chunks.next(), chunks.next()) else { | ||
// Single word / no printable chars - no way to make the line shorter. | ||
return None; | ||
}; | ||
|
||
// Do not enforce the line length for lines that end with a URL, as long as the URL | ||
// begins before the limit. | ||
let last_chunk = chunks.last().unwrap_or(second_chunk); | ||
if last_chunk.contains("://") { | ||
if width.get() - last_chunk.width() <= limit.value() as usize { | ||
return None; | ||
} | ||
} | ||
|
||
// Obtain the start offset of the part of the line that exceeds the limit. | ||
let mut start_offset = line.start(); | ||
let mut start_width = LineWidthBuilder::new(tab_size); | ||
for c in line.chars() { | ||
if start_width < limit { | ||
start_offset += c.text_len(); | ||
start_width = start_width.add_char(c); | ||
} else { | ||
break; | ||
} | ||
} | ||
|
||
Some(Self { | ||
range: TextRange::new(start_offset, line.end()), | ||
width: width.get(), | ||
}) | ||
} | ||
|
||
/// Return the range of the overlong portion of the line. | ||
pub(super) const fn range(&self) -> TextRange { | ||
self.range | ||
} | ||
|
||
/// Return the measured width of the line, without any trailing pragma comments. | ||
pub(super) const fn width(&self) -> usize { | ||
self.width | ||
} | ||
} | ||
|
||
/// A [`Line`] that may have trailing pragma comments stripped. | ||
#[derive(Debug)] | ||
enum StrippedLine<'a> { | ||
/// The [`Line`] was unchanged. | ||
Unchanged(&'a Line<'a>), | ||
/// The [`Line`] was changed such that a trailing pragma comment (e.g., `# type: ignore`) was | ||
/// removed. The stored [`Line`] consists of the portion of the original line that precedes the | ||
/// pragma comment. | ||
WithoutPragma(Line<'a>), | ||
} | ||
|
||
impl<'a> StrippedLine<'a> { | ||
/// Strip trailing comments from a [`Line`], if the line ends with a pragma comment (like | ||
/// `# type: ignore`) or, if necessary, a task comment (like `# TODO`). | ||
fn from_line(line: &'a Line<'a>, indexer: &Indexer, task_tags: &[String]) -> Self { | ||
let [comment_range] = indexer.comment_ranges().comments_in_range(line.range()) else { | ||
return Self::Unchanged(line); | ||
}; | ||
|
||
// Convert from absolute to relative range. | ||
let comment_range = comment_range - line.start(); | ||
let comment = &line.as_str()[comment_range]; | ||
|
||
// Ex) `# type: ignore` | ||
if is_pragma_comment(comment) { | ||
// Remove the pragma from the line. | ||
let prefix = &line.as_str()[..usize::from(comment_range.start())].trim_end(); | ||
return Self::WithoutPragma(Line::new(prefix, line.start())); | ||
} | ||
|
||
// Ex) `# TODO(charlie): ...` | ||
if !task_tags.is_empty() { | ||
let Some(trimmed) = comment.strip_prefix('#') else { | ||
return Self::Unchanged(line); | ||
}; | ||
let trimmed = trimmed.trim_start(); | ||
if task_tags | ||
.iter() | ||
.any(|task_tag| trimmed.starts_with(task_tag)) | ||
{ | ||
// Remove the task tag from the line. | ||
let prefix = &line.as_str()[..usize::from(comment_range.start())].trim_end(); | ||
return Self::WithoutPragma(Line::new(prefix, line.start())); | ||
} | ||
} | ||
|
||
Self::Unchanged(line) | ||
} | ||
} | ||
|
||
impl<'a> Deref for StrippedLine<'a> { | ||
type Target = Line<'a>; | ||
|
||
fn deref(&self) -> &Self::Target { | ||
match self { | ||
Self::Unchanged(line) => line, | ||
Self::WithoutPragma(line) => line, | ||
} | ||
} | ||
} | ||
|
||
/// Returns the width of a given string, accounting for the tab size. | ||
fn measure(s: &str, tab_size: TabSize) -> LineWidthBuilder { | ||
let mut width = LineWidthBuilder::new(tab_size); | ||
width = width.add_str(s); | ||
width | ||
} |
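The single-chunk and trailing-URL exemptions in `try_from_line` can be illustrated in isolation. This is a minimal sketch, not the PR's code: `is_exempt` is a hypothetical name, the width is approximated by a character count (no tabs, no Unicode width handling), and the line is assumed to have been measured as overlong already.

```rust
/// Returns `true` if an overlong `line` should still be exempt from the check.
/// Hypothetical helper; widths are plain character counts.
fn is_exempt(line: &str, limit: usize) -> bool {
    let mut chunks = line.split_whitespace();
    let second = match (chunks.next(), chunks.next()) {
        (Some(_), Some(second)) => second,
        // Zero or one whitespace-separated chunk: the line cannot be wrapped.
        _ => return true,
    };
    // The last chunk is the candidate URL; with exactly two chunks it is the second.
    let last = chunks.last().unwrap_or(second);
    if !last.contains("://") {
        return false;
    }
    // Exempt only if everything before the URL fits within the limit.
    let width = line.chars().count();
    width - last.chars().count() <= limit
}
```

For example, `# see https://example.com/a/very/long/path` is exempt with a limit of 20, because only the six characters before the URL count against the limit, but it is not exempt with a limit of 5.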
That's very conservative (but correct). It means that only lines shorter than 22 characters skip the check below. This optimisation becomes less effective if we increase the default `tab_size` to 8, where it would apply only to lines shorter than 11 characters. I'm not sure what we should do about it, though. Dividing by two could be an option, since it's unlikely that every second character is a tab, but it would reduce correctness.
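The thresholds discussed in this comment follow from the worst-case bound `line.len() * tab_size < limit`. The sketch below reproduces that arithmetic, assuming a limit of 88 (the value used in the fixtures) and tab sizes of 4 and 8. Both function names are hypothetical.

```rust
/// Returns `true` when a line of `byte_len` bytes cannot exceed `limit` columns,
/// even in the worst case where every byte is a tab.
fn cannot_be_overlong(byte_len: usize, tab_size: usize, limit: usize) -> bool {
    byte_len * tab_size < limit
}

/// Largest byte length for which the early exit above still fires.
/// Assumes `tab_size >= 1`.
fn longest_skipped_line(tab_size: usize, limit: usize) -> usize {
    limit.saturating_sub(1) / tab_size
}
```

Because the comparison is strict, a line of exactly 22 bytes with a tab size of 4, or exactly 11 bytes with a tab size of 8, still goes on to be measured.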
This was already present, just moved. We used to have a much better heuristic, because we assumed each byte was at most one character, but tabs ruin that -- a single byte can be as large as the tab width, sadly.
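To make the point about tabs concrete, here is a minimal sketch of a column counter in which a single tab byte can occupy up to `tab_size` columns. It assumes tabs advance to the next tab stop and that every other character is one column wide. `measure_columns` is a hypothetical function, not the `LineWidthBuilder` used in the PR.

```rust
/// Width of `s` in columns, advancing each tab to the next multiple of `tab_size`.
/// Assumes `tab_size >= 1` and that every non-tab character is one column wide.
fn measure_columns(s: &str, tab_size: usize) -> usize {
    let mut column = 0;
    for c in s.chars() {
        if c == '\t' {
            // Jump to the next tab stop: between 1 and `tab_size` columns.
            column += tab_size - (column % tab_size);
        } else {
            column += 1;
        }
    }
    column
}
```

A lone tab is one byte but measures `tab_size` columns here, so the byte length alone is not an upper bound on the width.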