Add `Metrics/CollectionLiteralLength` cop #11584

sambostock · 2023-02-16T23:01:33Z

This cop checks for extremely long Array and Hash literals, and Set pseudo-literals (actually parameters to the Set.[] method).

Before submitting the PR make sure the following are checked:

The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences.
Wrote good commit messages.
~~Commit message starts with [Fix #issue-number] (if the related issue exists).~~
Feature branch is up-to-date with master (if not - rebase it).
Squashed related commits together.
Added tests.
Ran bundle exec rake default. It executes all tests and runs RuboCop on its own code.
Added an entry (file) to the changelog folder named {change_type}_{change_description}.md if the new code introduces user-observable changes. See changelog entry format for details.

sambostock · 2023-02-16T23:29:35Z

config/default.yml

+  Description: 'Checks for extremely large literals, such as `Array`s and `Hash`es with many entries.'
+  Enabled: pending
+  VersionAdded: '<<next>>'
+  LengthThreshold: 250


I picked this arbitrarily. 🤷

We shouldn't pick a number so high that it makes the rule useless, but also shouldn't be too aggressive in our default. Application owners are free to adjust the number as they see fit.

koic · 2023-02-17T05:18:25Z

I can understand that LengthThreshold is set to a sufficient value, but I'm not sure it's of general interest to catch cases like this.
For example, hash parameters may be used in Ruby DSL instead of YAML. In this case the number doesn't matter.
As a result I'm not sure whether it's useful to have this cop in the core by default.

sambostock · 2023-02-17T06:08:54Z

This is actually related to #11588 which I just opened.

An application of ours had an array of 2000+ sub-arrays of reconciliation data to be inserted in a backfill rake task. A O(n²) loop in Style/WordArray was making it take hours to investigate that file, which sent me down this rabbit hole (and fixing it in #11588).

I created this cop because I have seen many instances of this over the years, where otherwise simple files are crowded by large data/config sections that would be better extracted into separate files (e.g. CSV), or better yet, stored externally from the git repo all together, and uploaded to the app for one time use (via an approach like a Maintenance Tasks' CSV task).

Another occurrence I saw recently was a project where they'd tried to load a CSV, but it was taking too long (simple mistake, see rubocop/rubocop-performance#328 which I opened to catch it), so they decided to hard code a Hash of hundreds of thousands of entries as a constant in an otherwise small file. This cop would have flagged this, and hopefully the author would have investigated.

It is my opinion that in most projects very large literals would be a smell (perhaps 500, or 1,000 would be better thresholds), and that projects that use a DSL where this makes sense would simply disable the cop.

spec/rubocop/cop/style/lengthy_literal_spec.rb

This cop checks for extremely long `Array` and `Hash` literals, and `Set` pseudo-literals (actually parameters to the `Set.[]` method).

bbatsov · 2023-02-18T07:31:59Z

I like the overall idea, although I'm not sure whether this is something we can have enabled by default. A couple of remarks:

Given that this targets only collection literals I'd go for a name that's more specific
The default seems pretty big and I'm curious how you came up with it (I saw in one remark this was done arbitrary, so if that's really the case we can discuss options here)

sambostock · 2023-02-18T14:47:24Z

I'm not sure whether this is something we can have enabled by default.

Fair. We'll probably enable it in rubocop-shopify, as others can too.

Given that this targets only collection literals I'd go for a name that's more specific

Yeah, good point. I'd originally chosen a generic name because I wanted to also include strings, but as I started to look into it it wasn't as simple. Perhaps LengthyCollectionLiteral? I originally had Huge and want sure if that sounded okay. Long could also work.

The default seems pretty big and I'm curious how you came up with it (I saw in one remark this was done arbitrary, so if that's really the case we can discuss options here)

Yeah, I just picked a number that felt

Big enough to not be contentious
- Something too small, like 10, could appear when doing things like building API payloads, so would have too many false positives.
Small enough to actually be triggered by real usage
I think we could all agree a literal of length 999_999_999 doesn't belong in a Ruby file, but that's not a very useful threshold.

So not fully arbitrary, but not based on any metrics or anything like that.

bbatsov · 2023-02-23T09:16:12Z

Perhaps LengthyCollectionLiteral?

I'd be fine by me. @rubocop/rubocop-core Proposals for a better name would be welcome.

Btw, it's also debatable if this is a Style cop or a Metrics cop, given it's similar in spirit to method/class length checks. If it's there it can be simple Metrics/CollectionLength or something along those lines.

koic · 2023-02-24T09:41:26Z

Btw, it's also debatable if this is a Style cop or a Metrics cop, given it's similar in spirit to method/class length checks. If it's there it can be simple Metrics/CollectionLength or something along those lines.

Interesting. Metrics may be better.

spec/rubocop/cop/style/lengthy_literal_spec.rb

sambostock · 2023-02-24T16:00:44Z

@bbatsov, @koic what do you think of Metrics/LongCollectionLiteral?

I think it's important to have Literal in there, because the target isn't large collections in general, it's large collection literals that shouldn't be inline in a class/module, and instead extracted to config files or something (even if the config file is a Ruby file using a DSL which is excluded from this cop).

bbatsov · 2023-02-24T21:21:10Z

@bbatsov, @koic what do you think of Metrics/LongCollectionLiteral?

I'm fine with that name. CollectionLiteralLength would be fine as well IMO. (and more in line with current naming)

sambostock · 2023-02-24T22:20:57Z

CollectionLiteralLength would be fine as well IMO. (and more in line with current naming)

Ah yes, I agree that is better. I'll make the change.

This test appears to disable all departments except `Style` for the purposes of the test. However, it missed `Metrics`, which means `pending` `Metrics` tests result in unexpected output.

bbatsov · 2023-02-26T07:02:27Z

Thanks!

sambostock force-pushed the lengthy-literal branch 6 times, most recently from 82bd9d5 to ab1c871 Compare February 16, 2023 23:26

sambostock commented Feb 16, 2023

View reviewed changes

jaynetics reviewed Feb 17, 2023

View reviewed changes

spec/rubocop/cop/style/lengthy_literal_spec.rb Outdated Show resolved Hide resolved

Add Style/LengthyLiteral cop

a9a16ce

This cop checks for extremely long `Array` and `Hash` literals, and `Set` pseudo-literals (actually parameters to the `Set.[]` method).

sambostock force-pushed the lengthy-literal branch from 55c450e to a9a16ce Compare February 17, 2023 18:30

sambostock requested a review from jaynetics February 17, 2023 18:30

koic reviewed Feb 24, 2023

View reviewed changes

spec/rubocop/cop/style/lengthy_literal_spec.rb Outdated Show resolved Hide resolved

koic reviewed Feb 24, 2023

View reviewed changes

spec/rubocop/cop/style/lengthy_literal_spec.rb Outdated Show resolved Hide resolved

koic reviewed Feb 24, 2023

View reviewed changes

spec/rubocop/cop/style/lengthy_literal_spec.rb Outdated Show resolved Hide resolved

sambostock added 2 commits February 24, 2023 10:53

Remove 'require "set"'

850118c

Use let instead of private methods

558ba2e

sambostock added 2 commits February 25, 2023 22:49

Rename Style/LengthyLiteral to Metrics/CollectionLiteralLength

c13345f

Add missing Metrics department to test config

b56c63d

This test appears to disable all departments except `Style` for the purposes of the test. However, it missed `Metrics`, which means `pending` `Metrics` tests result in unexpected output.

bbatsov merged commit e4f7799 into rubocop:master Feb 26, 2023

sambostock deleted the lengthy-literal branch February 26, 2023 14:44

sambostock changed the title ~~Add Style/LengthyLiteral cop~~ Add Metrics/CollectionLiteralLength cop Mar 6, 2023

sambostock mentioned this pull request Mar 7, 2023

Bump rubocop from 1.45.1 to 1.48.0 Shopify/ruby-style-guide#510

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `Metrics/CollectionLiteralLength` cop #11584

Add `Metrics/CollectionLiteralLength` cop #11584

sambostock commented Feb 16, 2023 •

edited

sambostock Feb 16, 2023

koic commented Feb 17, 2023

sambostock commented Feb 17, 2023 •

edited

bbatsov commented Feb 18, 2023

sambostock commented Feb 18, 2023

bbatsov commented Feb 23, 2023 •

edited

koic commented Feb 24, 2023

sambostock commented Feb 24, 2023

bbatsov commented Feb 24, 2023 •

edited

sambostock commented Feb 24, 2023

bbatsov commented Feb 26, 2023

Add Metrics/CollectionLiteralLength cop #11584

Add Metrics/CollectionLiteralLength cop #11584

Conversation

sambostock commented Feb 16, 2023 • edited

sambostock Feb 16, 2023

Choose a reason for hiding this comment

koic commented Feb 17, 2023

sambostock commented Feb 17, 2023 • edited

bbatsov commented Feb 18, 2023

sambostock commented Feb 18, 2023

bbatsov commented Feb 23, 2023 • edited

koic commented Feb 24, 2023

sambostock commented Feb 24, 2023

bbatsov commented Feb 24, 2023 • edited

sambostock commented Feb 24, 2023

bbatsov commented Feb 26, 2023

Add `Metrics/CollectionLiteralLength` cop #11584

Add `Metrics/CollectionLiteralLength` cop #11584

sambostock commented Feb 16, 2023 •

edited

sambostock commented Feb 17, 2023 •

edited

bbatsov commented Feb 23, 2023 •

edited

bbatsov commented Feb 24, 2023 •

edited