-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: update Unicode letter detection in capitalized-comments rule #18375
Conversation
✅ Deploy Preview for docs-eslint canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Would like another review before merging.
A couple of things:
I think overall, I'm not sure of the utility of this change. I suppose it's technically a bug, but pragmatically, I'm not sure how this change benefits users. That said, if people feel strongly that this should be fixed, we can merge this. Let's just please be careful about spending time on rules that are just enforcing stylistic preferences. |
The main purpose was to remove lib/rules/utils/patterns/letters.js, which has the character of legacy internationalization code, and is now covered by built-in language functionality. The remaining changes are to ensure that the affected rule is still working as intended after the Unicode upgrade. This is in fact a change request in form of a PR, so it's fine to close it if there is no consensus that it should be merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
I'm fine with this change because it replaces a manual and now outdated check with a built-in language feature.
As we have 3 people in favor, and @nzakas is not opposed to merging this, I'm going to merge it now. |
Prerequisites checklist
What is the purpose of this pull request? (put an "X" next to an item)
[ ] Documentation update
[ ] Bug fix (template)
[ ] New rule (template)
[X] Changes an existing rule (template)
[ ] Add autofix to a rule
[ ] Add a CLI option
[ ] Add something to the core
[ ] Other, please explain:
What rule do you want to change?
capitalized-comments
What change do you want to make (place an "X" next to just one item)?
[X] Generate more warnings
[ ] Generate fewer warnings
[ ] Implement autofix
[ ] Implement suggestions
How will the change be implemented (place an "X" next to just one item)?
[ ] A new option
[X] A new default behavior
[ ] Other
Please provide some example code that this change will affect:
What does the rule currently do for this code?
Reports no errors.
What will the rule do after it's changed?
It will report the line comments in both examples as errors. All line comments use wrong capitalization according to current versions of Unicode.
What changes did you make? (Give an overview)
This PR updates the capitalized-comments rule to recognize characters that are considered letters in current versions of Unicode. Currently, the rule only recognizes letters that were such in Unicode 7.0.0, and only if composed by a single code unit (up to code point U+FFFF).
\p{L}
(see compatibility) to match all letters in the version of Unicode of supported by the running engine.Compared to the current regex, the new one will match 698 more letters that have uppercase or lowercase variations. These are the characters that could report more problems with the capitalized-comments rule.
Analysis details
The following script can be placed in the eslint repo directory with the main branch checked out to find characters that are considered letters:
This is the output:
Is there anything you'd like reviewers to focus on?
JavaScript engines are routinely updated to implement the latest version of Unicode. If the Unicode specification changes, there is a chance that the Unicode character class escape
\p{L}
may start to work differently in new version of Node.js. This means that the rule's behavior may change depending on the version of Node.js used by the user. The advantage is that the regex will never become outdated.We have the same situation for other Unicode features in our code, like the
\p{Mc}
escape, or forIntl.Segmenter
.