
Benchmark all rules #3570

Merged
merged 2 commits into main from benchmark-all-rules on Mar 17, 2023

Conversation

MichaReiser
Member

@MichaReiser MichaReiser commented Mar 17, 2023

Summary

This PR adds the new benchmark group linter/all-rules (and renames the existing group to linter/default-rules).

The motivation for benchmarking all rules is that new rules are not part of the default set and are, thus, not benchmarked. This can result in us missing a new rule that regresses performance for all users who enable it.

Considerations

Why not change the existing benchmark to run all rules: The default-rules benchmark lets us track the performance of Ruff's infrastructure more closely. The cost of our infrastructure (scope analysis, traversing the tree) is negligible when running all rules but more significant when running only some rules. For now, at least, the cost of running a few more benchmarks is "cheap" because the CI job spends most of its time building the benchmark.
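The trade-off described above can be illustrated with a toy cost model (the numbers below are made up for illustration, not Ruff measurements): with only the default rules enabled, the fixed infrastructure cost is a large share of total runtime, so that group stays sensitive to infrastructure regressions.

```rust
// Toy cost model: total work per file is a fixed infrastructure cost
// (parsing, scope analysis, tree traversal) plus a per-rule cost.
// All constants are illustrative, not measured from Ruff.
fn infra_share(infra: u64, per_rule: u64, rules: u64) -> f64 {
    infra as f64 / (infra + per_rule * rules) as f64
}

fn main() {
    let (infra, per_rule) = (100, 5);
    // default-rules group: infrastructure is a large enough share of the
    // runtime that infrastructure regressions show up clearly.
    println!(
        "default-rules infra share: {:.0}%",
        100.0 * infra_share(infra, per_rule, 50)
    ); // prints "default-rules infra share: 29%"
    // all-rules group: per-rule cost dominates, so the same infrastructure
    // regression would barely move the needle.
    println!(
        "all-rules infra share: {:.0}%",
        100.0 * infra_share(infra, per_rule, 500)
    ); // prints "all-rules infra share: 4%"
}
```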

@MichaReiser MichaReiser marked this pull request as ready for review March 17, 2023 07:12
@MichaReiser MichaReiser changed the title benchmarks: Benchmark all rules Benchmark all rules Mar 17, 2023
@github-actions
Contributor

github-actions bot commented Mar 17, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
linter/all-rules/large/dataset.py                                                 1.00     14.5±0.06ms     2.8 MB/sec
linter/all-rules/numpy/ctypeslib.py                                               1.00      3.8±0.02ms     4.3 MB/sec
linter/all-rules/numpy/globals.py                                                 1.00    436.3±1.57µs     6.8 MB/sec
linter/all-rules/pydantic/types.py                                                1.00      6.4±0.01ms     4.0 MB/sec
linter/default-rules/large/dataset.py                                             1.00      8.2±0.01ms     5.0 MB/sec
linter/default-rules/numpy/ctypeslib.py                                           1.00   1781.3±3.16µs     9.3 MB/sec
linter/default-rules/numpy/globals.py                                             1.00    186.3±0.56µs    15.8 MB/sec
linter/default-rules/pydantic/types.py                                            1.00      3.8±0.01ms     6.7 MB/sec
linter/large/dataset.py                    1.00      8.3±0.01ms     4.9 MB/sec  
linter/numpy/ctypeslib.py                  1.00      2.2±0.01ms   156.8 MB/sec  
linter/numpy/globals.py                    1.00  1146.0±10.33µs   155.5 MB/sec  
linter/pydantic/types.py                   1.00      3.9±0.02ms     6.6 MB/sec  

Windows

group                                      main                                    pr
-----                                      ----                                    --
linter/all-rules/large/dataset.py                                                  1.00     20.1±1.17ms     2.0 MB/sec
linter/all-rules/numpy/ctypeslib.py                                                1.00      5.3±0.21ms     3.1 MB/sec
linter/all-rules/numpy/globals.py                                                  1.00   687.8±45.59µs     4.3 MB/sec
linter/all-rules/pydantic/types.py                                                 1.00      9.0±0.41ms     2.8 MB/sec
linter/default-rules/large/dataset.py                                              1.00     11.5±0.41ms     3.5 MB/sec
linter/default-rules/numpy/ctypeslib.py                                            1.00      2.4±0.10ms     7.0 MB/sec
linter/default-rules/numpy/globals.py                                              1.00   295.0±19.57µs    10.0 MB/sec
linter/default-rules/pydantic/types.py                                             1.00      5.2±0.31ms     4.9 MB/sec
linter/large/dataset.py                    1.00     12.4±0.68ms     3.3 MB/sec   
linter/numpy/ctypeslib.py                  1.00      2.7±0.13ms   126.4 MB/sec   
linter/numpy/globals.py                    1.00  1441.7±116.37µs   123.6 MB/sec  
linter/pydantic/types.py                   1.00      5.4±0.27ms     4.7 MB/sec   

Member

@charliermarsh charliermarsh left a comment


Since we're doubling the number of output rows, should we consider reducing the number of files that are included in the benchmark, just to keep it information-dense? (In other words, how much marginal benefit is there right now for each of the four files we're analyzing? Are any of them redundant?)

@MichaReiser
Member Author

The output will have 8 rows after merging (the last 4 rows appear only because the benchmark names on main and on this branch don't match).

> Since we're doubling the number of output rows, should we consider reducing the number of files that are included in the benchmark, just to keep it information-dense? (In other words, how much marginal benefit is there right now for each of the four files we're analyzing? Are any of them redundant?)

We could. I didn't spend much time picking the files but my thinking was:

  • a small file -> sensitive to changes that increase infrastructure overhead: [numpy/globals.py]
  • two medium files -> to represent the average case: [pydantic/types.py, numpy/ctypeslib.py]
  • a large file -> sensitive to rules with O(n^2) or worse complexity: [large/dataset.py]
  • a file with many type annotations: [pydantic/types.py]

We could potentially remove numpy/ctypeslib.py because it is a medium file, but I think it's worth keeping because the large file has barely any comments.


     )?),
-    TestCase::normal(TestFile::try_download("numpy/ctypeslib.py", "https://github.com/numpy/numpy/blob/main/numpy/ctypeslib.py")?),
+    TestCase::normal(TestFile::try_download("numpy/ctypeslib.py", "https://raw.githubusercontent.com/numpy/numpy/e42c9503a14d66adfd41356ef5640c6975c45218/numpy/ctypeslib.py")?),
Member Author


Whoops... I never verified that it downloads the correct files. The non-raw endpoints return HTML, not Python 🤭
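The fix swaps the github.com "blob" page URL, which serves an HTML view of the file, for the raw.githubusercontent.com endpoint, which serves the file contents. As a hypothetical illustration (this helper is not part of the Ruff codebase), the rewrite is a pure string transformation:

```rust
/// Hypothetical helper: rewrite a GitHub blob page URL of the shape
/// https://github.com/{owner}/{repo}/blob/{ref}/{path}
/// into the raw.githubusercontent.com URL that serves the file contents.
fn blob_to_raw(url: &str) -> Option<String> {
    let rest = url.strip_prefix("https://github.com/")?;
    let (owner, rest) = rest.split_once('/')?;
    let (repo, rest) = rest.split_once('/')?;
    // Only blob URLs have a raw equivalent of this shape.
    let ref_and_path = rest.strip_prefix("blob/")?;
    Some(format!(
        "https://raw.githubusercontent.com/{owner}/{repo}/{ref_and_path}"
    ))
}

fn main() {
    let blob = "https://github.com/numpy/numpy/blob/main/numpy/ctypeslib.py";
    assert_eq!(
        blob_to_raw(blob).as_deref(),
        Some("https://raw.githubusercontent.com/numpy/numpy/main/numpy/ctypeslib.py")
    );
}
```

Pinning the ref to a commit SHA, as the fix does, additionally keeps the benchmark input stable over time.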

@@ -68,6 +69,28 @@ pub struct TestFile {
code: String,
}

static TARGET_DIR: once_cell::sync::Lazy<PathBuf> = once_cell::sync::Lazy::new(|| {
Member Author


This fixes an issue where the benchmarks created a target folder inside the ruff_benchmark directory instead of reusing Cargo's target directory (copied from criterion).
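A minimal sketch of that kind of lookup, under the simplified assumption that it honors CARGO_TARGET_DIR when set and otherwise falls back to `target` under the workspace root (hypothetical helper, not the exact criterion code):

```rust
use std::path::PathBuf;

/// Hypothetical, simplified resolution of Cargo's target directory:
/// prefer an explicit CARGO_TARGET_DIR override, otherwise fall back
/// to `<workspace_root>/target`.
fn resolve_target_dir(env_override: Option<&str>, workspace_root: &str) -> PathBuf {
    match env_override {
        Some(dir) => PathBuf::from(dir),
        None => PathBuf::from(workspace_root).join("target"),
    }
}

fn main() {
    // With CARGO_TARGET_DIR set, the override wins.
    assert_eq!(
        resolve_target_dir(Some("/tmp/shared-target"), "/repo"),
        PathBuf::from("/tmp/shared-target")
    );
    // Without it, benchmark artifacts land in the workspace's target dir,
    // not in a new folder next to the benchmark crate.
    assert_eq!(
        resolve_target_dir(None, "/repo"),
        PathBuf::from("/repo/target")
    );
}
```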

@MichaReiser MichaReiser merged commit 87fab4a into main Mar 17, 2023
12 checks passed
@MichaReiser MichaReiser deleted the benchmark-all-rules branch March 17, 2023 18:29