Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix #12600] Support Prism as a Ruby parser #12724

Merged
merged 1 commit into from
Mar 1, 2024

Conversation

koic
Copy link
Member

@koic koic commented Feb 29, 2024

Resolves #12600.

This is the initial basic implementation to support multiple parser engines. It is still in an experimental phase. There are still incompatibilities with Prism compared to the Parser gem, but as the RuboCop core, it is possible to begin providing support for Prism independently within the RuboCop core.

Important

To work this feature, the following patch for RuboCop AST needs to be released:
rubocop/rubocop-ast#277

Setting the parser engine

RuboCop allows switching the backend parser by specifying either parser_whitequark or parser_prism for the ParserEngine.

Here are the parsers used as backends for each value:

By default, parser_whitequark is implicitly used.

parser_whitequark can analyze source code from Ruby 2.0 and above:

AllCops:
  ParserEngine: parser_whitequark

parser_prism can analyze source code from Ruby 3.3 and above:

AllCops:
  ParserEngine: parser_prism
  TargetRubyVersion: 3.3

parser_prism tends to perform analysis faster than parser_whitequark.

Caution

Since the use of Prism is experimental, it is not included in RuboCop's runtime dependencies.
If running through Bundler, please first add gem 'prism' to your Gemfile:

gem 'prism'

There are some incompatibilities with parser_whitequark, making it still considered experimental. For known issues, please refer to the Prism issues: https://github.com/ruby/prism/issues?q=is%3Aissue+is%3Aopen+label%3Arubocop

Incompatibility

At the time of opening this PR, the following cops have incompatibility with Prism:

  • Gemspec/RequiredRubyVersion
  • InternalAffairs/LocationLineEqualityComparison
  • Layout/ClassStructure
  • Layout/EmptyLines
  • Layout/EndOfLine
  • Layout/HeredocIndentation
  • Layout/IndentationStyle
  • Layout/LineLength
  • Layout/SpaceAroundKeyword
  • Layout/SpaceAroundOperators
  • Layout/SpaceInsideHashLiteralBraces
  • Layout/TrailingWhitespace
  • Lint/AmbiguousOperator
  • Lint/AmbiguousRegexpLiteral
  • Lint/CircularArgumentReference
  • Lint/DeprecatedClassMethods
  • Lint/DeprecatedConstants
  • Lint/ErbNewArguments
  • Lint/NonDeterministicRequireOrder
  • Lint/NumberedParameterAssignment
  • Lint/ParenthesesAsGroupedExpression
  • Lint/RedundantDirGlobSort
  • Lint/RedundantRequireStatement
  • Lint/RefinementImportMethods
  • Lint/RequireParentheses
  • Lint/Syntax
  • Lint/UnifiedInteger
  • Lint/UselessElseWithoutRescue
  • Naming/BlockForwarding
  • Naming/HeredocDelimiterCase
  • Naming/HeredocDelimiterNaming
  • Naming/VariableNumber
  • Security/YamlLoad
  • Style/ArgumentsForwarding
  • Style/ArrayIntersect
  • Style/BlockDelimiters
  • Style/CollectionCompact
  • Style/CommandLiteral
  • Style/ComparableClamp
  • Style/ConditionalAssignmentAssignInCondition
  • Style/ConditionalAssignmentAssignToCondition
  • Style/DataInheritance
  • Style/DirEmpty
  • Style/FileEmpty
  • Style/FrozenStringLiteralComment
  • Style/GuardClause
  • Style/HashExcept
  • Style/HashSyntax
  • Style/HashTransformKeys
  • Style/HashTransformValues
  • Style/IfWithBooleanLiteralBranches
  • Style/MultilineTernaryOperator
  • Style/MultilineWhenThen
  • Style/MutableConstant
  • Style/NestedFileDirname
  • Style/NumericLiterals
  • Style/NumericPredicate
  • Style/ObjectThen
  • Style/QuotedSymbols
  • Style/RedundantBegin
  • Style/RedundantFreeze
  • Style/RedundantHeredocDelimiterQuotes
  • Style/RedundantParentheses
  • Style/RescueModifier
  • Style/SafeNavigation
  • Style/SelectByRegexp
  • Style/SingleLineMethods
  • Style/SlicingWithRange
  • Style/StringLiterals
  • Style/TernaryParentheses
  • Style/YamlFileRead
  • Style/YodaCondition

Some cop incompatibilities have been resolved in the Prism development line.

For known issues, please refer to the Prism issues: https://github.com/ruby/prism/issues?q=is%3Aissue+is%3Aopen+label%3Arubocop

Lint/Syntax cop

The messages generated by Lint/Syntax depend on the parser engine used.

Parser gem:

$ ruby -rparser/ruby33 -ve 'p Parser::Ruby33.parse("(")'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
(string):1:2: error: unexpected token $end

Displays unexpected token $end.

Prism:

$ ruby -rprism -rprism/translation/parser33 -ve 'p Prism::Translation::Parser33.parse("(")'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
(string):1:2: error: expected a matching `)`

Displays expected a matching ).

There are differences in the messages between Parser gem and Prism, but since Prism can provide clearer messages in some cases, this incompatibility is accepted. In other words, the messages may vary depending on the parser engine.

Test

To run tests with Prism, the command bundle exec rake prism_spec is provided. This task is also executed in GitHub Actions.

To run tests with Prism specifying files, set the environment variable PARSER_ENGINE=parser_prism:

$ PARSER_ENGINE=parser_prism path/to/test_spec.rb

In the context of testing with Prism, two options for test cases are provided: broken_on: :prism and unsupported_on: :prism.
Both options are utilized to skip tests specifically for Prism.

broken_on: :prism

Test cases failing due to Prism incompatibilities are marked with broken_on: :prism. This indicates an expectation for the issue to be resolved within Prism.

unsupported_on: :prism

Prism is designed to parse Ruby versions 3.3+, which means that features unique to Ruby versions 2.0 through 3.2 are not supported.
Test cases falling into this category are marked with unsupported_on: :prism. This marker is used for cases that are testable with the Parser gem but not with Prism.

Note

With bundle exec rake, prism_spec will be run instead of ascii_spec. The ascii_spec task has not been failing for a while, so it will not be run by default. However, ascii_spec task will continue to be checked in CI. If there are any failures originating from ascii_spec in CI, please run bundle exec ascii_spec to investigate.


Before submitting the PR make sure the following are checked:

  • The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences.
  • Wrote good commit messages.
  • Commit message starts with [Fix #issue-number] (if the related issue exists).
  • Feature branch is up-to-date with master (if not - rebase it).
  • Squashed related commits together.
  • Added tests.
  • Ran bundle exec rake default. It executes all tests and runs RuboCop on its own code.
  • Added an entry (file) to the changelog folder named {change_type}_{change_description}.md if the new code introduces user-observable changes. See changelog entry format for details.

@bbatsov
Copy link
Collaborator

bbatsov commented Feb 29, 2024

@koic I've just cut rubocop-ast 1.31, so you can proceed with this PR. I also cut RuboCop 1.61, so we can have a separate release focused mostly on Prism support.

@bbatsov
Copy link
Collaborator

bbatsov commented Feb 29, 2024

Overall the PR looks good to me and I've got no major feedback on your approach. Great work!

@koic koic force-pushed the support_prism branch 3 times, most recently from d022eea to fc29296 Compare February 29, 2024 12:55
@@ -24,7 +24,10 @@ Dir['tasks/**/*.rake'].each { |t| load t }
desc 'Run RuboCop over itself'
RuboCop::RakeTask.new(:internal_investigation)

task default: %i[documentation_syntax_check spec ascii_spec internal_investigation]
# The `ascii_spec` task has not been failing for a while, so it will not be run by default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a typo here?

Suggested change
# The `ascii_spec` task has not been failing for a while, so it will not be run by default.
# The `ascii_spec` task has been failing for a while, so it will not be run by default.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the mean here is that it's a legacy task that's no longer relevant. Probably we don't need it at all at this point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is not a typo.


if yaml.dig('AllCops', 'ParserEngine') == 'parser_prism'
require 'prism'
"Prism #{Prism::VERSION}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these reflect the "Parser (Whitequark)", "Parser (Prism)", and possible future "Prism" naming, to match the ParserEngine config?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser translation is now bundled with Prism, so I don't think mentioning it separately makes much sense. The multiple parser engines are something temporary - this will disappear once we switch to Prism's native AST format, so I wouldn't fret too much over the naming here. Even if we're stuck with it for a while, it's very unlikely we'll support more than a couple of parser engines.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Keeping the current representation is simpler and probably better.

@bbatsov
Copy link
Collaborator

bbatsov commented Mar 1, 2024

@koic I see the CI is green. Should we merge this and cut a quick release today or leave this for early next week?

@koic
Copy link
Member Author

koic commented Mar 1, 2024

@bbatsov Either option is fine. There might be points that could be improved upon in the future, but the basic implementation is complete.

One concern is whether the runtime dependency on Prism should be removed, similar to issue rubocop/rubocop-ast#282. This might potentially become a blocker for upgrades, just as with RuboCop AST.

The plan is to remove Prism from the rubocop.gemspec's runtime dependencies for the time being. Are there any concerns regarding this?

Currently, there is still the runtime dependency on Prism.

Resolves rubocop#12600.

This is the initial basic implementation to support multiple parser engines.
It is still in an experimental phase. There are still incompatibilities with Prism
compared to the Parser gem, but as the RuboCop core, it is possible to begin providing support
for Prism independently within the RuboCop core.

> [!IMPORTANT]
> To work this feature, the following patch for RuboCop AST needs to be released:
> rubocop/rubocop-ast#277

## Setting the parser engine

RuboCop allows switching the backend parser by specifying either
`parser_whitequark` or `parser_prism` for the `ParserEngine`.

Here are the parsers used as backends for each value:

- `ParserEngine: parser_whitequark` ... https://github.com/whitequark/parser
- `ParserEngine: parser_prism` ... https://github.com/ruby/prism (`Prism::Translation::Parser`)

By default, `parser_whitequark` is implicitly used.

`parser_whitequark` can analyze source code from Ruby 2.0 and above:

```yaml
AllCops:
  ParserEngine: parser_whitequark
```

`parser_prism` can analyze source code from Ruby 3.3 and above:

```yaml
AllCops:
  ParserEngine: parser_prism
  TargetRubyVersion: 3.3
```

`parser_prism` tends to perform analysis faster than `parser_whitequark`.

> [!CAUTION]
> Since the use of Prism is experimental, it is not included in RuboCop's runtime dependencies.
> If running through Bundler, please first add `gem 'prism'` to your Gemfile:
>
> ```ruby
> gem 'prism'
> ```

There are some incompatibilities with `parser_whitequark`, making it still considered experimental.
For known issues, please refer to the Prism issues:
https://github.com/ruby/prism/issues?q=is%3Aissue+is%3Aopen+label%3Arubocop

## Incompatibility

At the time of opening this PR, the following cops have incompatibility with Prism:

- `Gemspec/RequiredRubyVersion`
- `InternalAffairs/LocationLineEqualityComparison`
- `Layout/ClassStructure`
- `Layout/EmptyLines`
- `Layout/EndOfLine`
- `Layout/HeredocIndentation`
- `Layout/IndentationStyle`
- `Layout/LineLength`
- `Layout/SpaceAroundKeyword`
- `Layout/SpaceAroundOperators`
- `Layout/SpaceInsideHashLiteralBraces`
- `Layout/TrailingWhitespace`
- `Lint/AmbiguousOperator`
- `Lint/AmbiguousRegexpLiteral`
- `Lint/CircularArgumentReference`
- `Lint/DeprecatedClassMethods`
- `Lint/DeprecatedConstants`
- `Lint/ErbNewArguments`
- `Lint/NonDeterministicRequireOrder`
- `Lint/NumberedParameterAssignment`
- `Lint/ParenthesesAsGroupedExpression`
- `Lint/RedundantDirGlobSort`
- `Lint/RedundantRequireStatement`
- `Lint/RefinementImportMethods`
- `Lint/RequireParentheses`
- `Lint/Syntax`
- `Lint/UnifiedInteger`
- `Lint/UselessElseWithoutRescue`
- `Naming/BlockForwarding`
- `Naming/HeredocDelimiterCase`
- `Naming/HeredocDelimiterNaming`
- `Naming/VariableNumber`
- `Security/YamlLoad`
- `Style/ArgumentsForwarding`
- `Style/ArrayIntersect`
- `Style/BlockDelimiters`
- `Style/CollectionCompact`
- `Style/CommandLiteral`
- `Style/ComparableClamp`
- `Style/ConditionalAssignmentAssignInCondition`
- `Style/ConditionalAssignmentAssignToCondition`
- `Style/DataInheritance`
- `Style/DirEmpty`
- `Style/FileEmpty`
- `Style/FrozenStringLiteralComment`
- `Style/GuardClause`
- `Style/HashExcept`
- `Style/HashSyntax`
- `Style/HashTransformKeys`
- `Style/HashTransformValues`
- `Style/IfWithBooleanLiteralBranches`
- `Style/MultilineTernaryOperator`
- `Style/MultilineWhenThen`
- `Style/MutableConstant`
- `Style/NestedFileDirname`
- `Style/NumericLiterals`
- `Style/NumericPredicate`
- `Style/ObjectThen`
- `Style/QuotedSymbols`
- `Style/RedundantBegin`
- `Style/RedundantFreeze`
- `Style/RedundantHeredocDelimiterQuotes`
- `Style/RedundantParentheses`
- `Style/RescueModifier`
- `Style/SafeNavigation`
- `Style/SelectByRegexp`
- `Style/SingleLineMethods`
- `Style/SlicingWithRange`
- `Style/StringLiterals`
- `Style/TernaryParentheses`
- `Style/YamlFileRead`
- `Style/YodaCondition`

Some cop incompatibilities have been resolved in the Prism development line.

For known issues, please refer to the Prism issues:
https://github.com/ruby/prism/issues?q=is%3Aissue+is%3Aopen+label%3Arubocop

### `Lint/Syntax` cop

The messages generated by Lint/Syntax depend on the parser engine used.

Parser gem:

```console
$ ruby -rparser/ruby33 -ve 'p Parser::Ruby33.parse("(")'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
(string):1:2: error: unexpected token $end
```

Displays `unexpected token $end`.

Prism:

```console
$ ruby -rprism -rprism/translation/parser33 -ve 'p Prism::Translation::Parser33.parse("(")'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
(string):1:2: error: expected a matching `)`
```

Displays `expected a matching )`.

There are differences in the messages between Parser gem and Prism,
but since Prism can provide clearer messages in some cases, this incompatibility is accepted.
In other words, the messages may vary depending on the parser engine.

## Test

To run tests with Prism, the command `bundle exec rake prism_spec` is provided.
This task is also executed in GitHub Actions.

To run tests with Prism specifying files, set the environment variable `PARSER_ENGINE=parser_prism`:

```console
$ PARSER_ENGINE=parser_prism path/to/test_spec.rb
```

In the context of testing with Prism, two options for test cases are provided:
`broken_on: :prism` and `unsupported_on: :prism`.
Both options are utilized to skip tests specifically for Prism.

### `broken_on: :prism`

Test cases failing due to Prism incompatibilities are marked with `broken_on: :prism`.
This indicates an expectation for the issue to be resolved within Prism.

### `unsupported_on: :prism`

Prism is designed to parse Ruby versions 3.3+, which means that features unique to
Ruby versions 2.0 through 3.2 are not supported.
Test cases falling into this category are marked with `unsupported_on: :prism`.
This marker is used for cases that are testable with the Parser gem but not with Prism.

> [!NOTE]
> With `bundle exec rake`, `prism_spec` will be run instead of `ascii_spec`.
> The `ascii_spec` task has not been failing for a while, so it will not be run by default.
> However, `ascii_spec` task will continue to be checked in CI. If there are any failures
> originating from `ascii_spec` in CI, please run `bundle exec ascii_spec` to investigate.
@koic
Copy link
Member Author

koic commented Mar 1, 2024

For now, the runtime dependency on Prism has been removed from RuboCop, and the documentation has been updated to instruct on separate installation. This completes the current phase.

@bbatsov
Copy link
Collaborator

bbatsov commented Mar 1, 2024

I think that's the best course of action for now. @kddnewton Please update at some point the Prism docs for using it with RuboCop as well to reflect the recent changes. I probably cut the next RuboCop release early next week.

@bbatsov bbatsov merged commit c123b96 into rubocop:master Mar 1, 2024
34 checks passed
@koic koic deleted the support_prism branch March 1, 2024 18:48
@tisba
Copy link

tisba commented Mar 1, 2024

This is super awesome, thank you for all your work! 🙏

I gave c123b96 a spin against our Rails monolith and it works great with only Rails/UniqueValidationWithoutIndex producing an error (uninitialized constant Parser::Ruby33). That's from the rubocop-rails gem and I'm aware that this cannot be ready yet. Do you keep track of that already somewhere, or do you want me to file an issue? Would like to help if I can!

@koic
Copy link
Member Author

koic commented Mar 2, 2024

@tisba That needed to be resolved early on. I've opened rubocop/rubocop-rails#1245. Thank you for the early feedback!

@eregon
Copy link

eregon commented Mar 2, 2024

I was curious of the performance, so I tried this running rubocop on the whole ruby/spec (ruby/spec@408e230), running on CRuby 3.3.0.
To install the gem since it's not released yet:

cd rubocop
bundle
bundle exec rake build
gem i pkg/rubocop-*.gem

With this diff in ruby/spec:

diff --git a/.rubocop.yml b/.rubocop.yml
index be32ce8900..5cf4bbf1b0 100644
--- a/.rubocop.yml
+++ b/.rubocop.yml
@@ -1,7 +1,8 @@
 inherit_from: .rubocop_todo.yml
 
 AllCops:
-  TargetRubyVersion: 3.0
+  TargetRubyVersion: 3.3
+  ParserEngine: parser_prism # also used commented out for the baseline
   DisplayCopNames: true
   Exclude:
     - command_line/fixtures/bad_syntax.rb

There are some expected offenses since I'm changing TargetRubyVersion to 3.3, e.g. unnecessary require 'set'.

I used time rubocop --cache=false . followed by multiple time rubocop ., not sure if there is a better way to measure this?
From the timings it seems --cache=false disables parallel processing, and --no-parallel disables the cache.

baseline: TargetRubyVersion: 3.3, no ParserEngine set

$ time rubocop --cache=false .
4546 files inspected, 69 offenses detected, 68 offenses autocorrectable
rubocop --cache=false .  42.23s user 0.33s system 99% cpu 42.777 total

$ time rubocop --cache=false .
4546 files inspected, 69 offenses detected, 68 offenses autocorrectable
rubocop --cache=false .  42.23s user 0.34s system 99% cpu 42.794 total

$ time rubocop .
rubocop .  81.17s user 2.58s system 1077% cpu 7.774 total
$ time rubocop .
rubocop .  4.28s user 1.46s system 206% cpu 2.776 total
$ time rubocop .
rubocop .  4.25s user 1.51s system 209% cpu 2.753 total

Using the Prism translation layer: TargetRubyVersion: 3.3, ParserEngine: parser_prism:

$ time rubocop --cache=false .
4546 files inspected, 72 offenses detected, 68 offenses autocorrectable
rubocop --cache=false .  31.57s user 0.47s system 99% cpu 32.266 total

$ time rubocop --cache=false .
4546 files inspected, 72 offenses detected, 68 offenses autocorrectable
rubocop --cache=false .  31.50s user 0.40s system 99% cpu 32.069 total

$ time rubocop .
rubocop .  64.94s user 2.85s system 1011% cpu 6.702 total
$ time rubocop .
rubocop .  4.24s user 1.50s system 207% cpu 2.771 total
$ time rubocop .
rubocop .  4.20s user 1.49s system 204% cpu 2.776 total

So around a 25% speedup in that case for --cache=false.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make it easier to use RuboCop with Prism
5 participants