ci: Benchmark CI Step #3480

MichaReiser · 2023-03-13T12:51:42Z

This PR adds a new benchmark step that runs for every pull request and compares the PR performance against main. The action writes a comment (or updates an existing one) with the results.

This is inspired and partially based on Rome's and OXC's benchmarking setup that @Boshen created.

Test Plan

See this comment

MichaReiser · 2023-03-14T08:56:29Z

@Boshen It's not fully functional yet nor have I tested the caching but I think you could find some of it interesting ;)

MichaReiser · 2023-03-14T09:57:18Z

Okay, caching and running PR/main on different runners is a bad idea. The runtime difference is too big even if there are no changes.

sciyoshi · 2023-03-15T01:58:23Z

This is great! Two quick comments since I was working recently on the ecosystem CI check which also comments on the PR:

It would probably make sense to combine the commenting piece of this workflow with ecosystem-comment.yaml (and probably rename it to e.g. pr-comment.yaml). That way there will be a single comment rather than two which would probably make for less noise, and also the separate workflow means that the PR comment can also work for PRs from forks of this repo (which is basically every PR except yours and @charliermarsh's 😄)
I'm not sure what artifacts cargo bench requires, but assuming it only needs the debug binary, you could use the artifacts that are now uploaded by the cargo test step. That would save on the re-build time (at least for the main branch), and means that the benchmark only runs if the remaining tests pass.

Boshen · 2023-03-15T02:40:48Z

Very clever approaches for speeding up CI times! Thank you @MichaReiser, I stole this faster than you could merge this PR ;-)

I wonder if we can also build the binary on main and then just download it on the PR 🤔

Uploading the binary from the test step doesn't work I think, it it seems to be uploading a debug build? path: target/debug/ruff.

MichaReiser · 2023-03-15T06:58:18Z

This is great! Two quick comments since I was working recently on the #2677 which also comments on the PR:

Thanks for that work. I used your job as inspiration on how we can improve the benchmarking, and it seems there's more for me to learn 😃

Comments: That's clever. Let me give this a try.
Re-using the binary: The benchmark uses the release build of ruff_benchmark which we aren't building as part of the pull_request workflow. However, we could do something good for the environment (it won't speed up the overall time) and build the binary for main once and then re-use when running the baseline benchmarks on PRs.

I wonder if we can also build the binary on main and then just download it on the PR thinking

Yes, I'll give that a try

Very clever approaches for speeding up CI times! Thank you @MichaReiser, I stole this faster than you could merge this PR ;-)

😆 Thank you! I saw your PR yesterday and was, wow, that was quick. I should update the PR summary to clarify that this workflow is heavily inspired by your work on Rome's and OXC's benchmarks!

github-actions · 2023-03-15T08:44:05Z

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                        main                                   pr
-----                        ----                                   --
linter/large/dataset.py      1.00      9.1±0.09ms     4.5 MB/sec    1.01      9.2±0.04ms     4.4 MB/sec
linter/numpy/ctypeslib.py    1.00      3.0±0.04ms   111.8 MB/sec    1.01      3.0±0.00ms   111.2 MB/sec
linter/numpy/globals.py      1.00  1569.6±16.01µs   113.6 MB/sec    1.01   1590.0±4.44µs   112.1 MB/sec
linter/pydantic/types.py     1.00      4.3±0.04ms     6.0 MB/sec    1.02      4.3±0.01ms     5.9 MB/sec

Windows

group                        main                                   pr
-----                        ----                                   --
linter/large/dataset.py      1.00      9.2±0.15ms     4.4 MB/sec    1.00      9.2±0.45ms     4.4 MB/sec
linter/numpy/ctypeslib.py    1.00      3.0±0.07ms   113.8 MB/sec    1.00      2.9±0.06ms   114.3 MB/sec
linter/numpy/globals.py      1.00  1548.2±55.86µs   115.1 MB/sec    1.01  1567.2±63.16µs   113.7 MB/sec
linter/pydantic/types.py     1.00      4.3±0.16ms     6.0 MB/sec    1.01      4.4±0.17ms     5.9 MB/sec

MichaReiser · 2023-03-15T09:31:36Z

I implemented the recommended changes.

I haven't been able to test the caching because it requires triggering a build on main.... which I can only do after this PR lands.

EDIT: Caching on main isn't as straightforward as I have thought with our approach where we use cargo bench. cargo bench doesn't create a single binary but one binary for each benchmark. Each binary has a unique name linter-<hash>, that makes it difficult to extract. Building main should also take less time than building the PR branch because it is only necessary to build the difference.

One thing we could try is to setup a release build for main that warms the cache but I would prefer to leave this for another PR. Mainly because we should potentially re-visit our caching overall to avoid running into GitHubs 10GB cache limit, which results in cache-pruning (e.g. warm the cache on main for test, and release build, don't populate the cache from PR branches)

MichaReiser · 2023-03-15T09:32:47Z

.github/workflows/ci.yaml

      - name: Run ecosystem check
        run: |
          # Make executable, since artifact download doesn't preserve this
          chmod +x ruff ${{ steps.ruff-target.outputs.download-path }}/ruff

          scripts/check_ecosystem.py ruff ${{ steps.ruff-target.outputs.download-path }}/ruff | tee ecosystem-result
+          cat ecosystem-result > $GITHUB_STEP_SUMMARY


This adds the ecosystem output to the CI job summary

MichaReiser · 2023-03-15T09:33:22Z

.github/workflows/ci.yaml

@@ -80,6 +80,7 @@ jobs:
          # Setting RUSTDOCFLAGS because `cargo doc --check` isn't yet implemented (https://github.com/rust-lang/cargo/issues/10025).
          RUSTDOCFLAGS: "-D warnings"
      - uses: actions/upload-artifact@v3
+        if: ${{ matrix.os == 'ubuntu-latest' }}


We only need the linux binary. Trying to run this on windows creates a warning because the file is named ruff.exe and not ruff

charliermarsh

I love this

.github/workflows/pr-comment.yaml

MichaReiser force-pushed the ci/benchmark-step branch 16 times, most recently from 93a5248 to cb551d3 Compare March 14, 2023 08:39

MichaReiser force-pushed the ci/benchmark-step branch 2 times, most recently from 669a3cf to 1ac2016 Compare March 14, 2023 09:44

MichaReiser force-pushed the ci/benchmark-step branch from 65bae7d to 844d366 Compare March 14, 2023 10:02

MichaReiser marked this pull request as ready for review March 14, 2023 10:22

astral-sh deleted a comment from github-actions bot Mar 14, 2023

MichaReiser self-assigned this Mar 15, 2023

MichaReiser force-pushed the ci/benchmark-step branch 3 times, most recently from eec0f44 to ba8685c Compare March 15, 2023 08:42

MichaReiser force-pushed the ci/benchmark-step branch 5 times, most recently from 0e6c1b0 to b719de5 Compare March 15, 2023 09:13

astral-sh deleted a comment from github-actions bot Mar 15, 2023

MichaReiser force-pushed the ci/benchmark-step branch from b719de5 to 8e63a2a Compare March 15, 2023 09:30

MichaReiser requested a review from charliermarsh March 15, 2023 09:30

MichaReiser commented Mar 15, 2023

View reviewed changes

MichaReiser force-pushed the ci/benchmark-step branch 2 times, most recently from 5e11bb6 to c0566f3 Compare March 15, 2023 10:02

charliermarsh approved these changes Mar 15, 2023

View reviewed changes

.github/workflows/pr-comment.yaml Show resolved Hide resolved

MichaReiser added 5 commits March 16, 2023 08:38

Publish benchmark results

eae97be

Run benchmarks on single machine

72d5e5b

Cache main benchmark binary, merge with ecosystem-results comment

6a038ed

Remove main caching

792f83e

Add Ecosystem Header

fbb1bbc

MichaReiser force-pushed the ci/benchmark-step branch from 71f4a69 to fbb1bbc Compare March 16, 2023 07:41

MichaReiser merged commit aa51ece into main Mar 16, 2023

MichaReiser deleted the ci/benchmark-step branch March 16, 2023 08:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: Benchmark CI Step #3480

ci: Benchmark CI Step #3480

MichaReiser commented Mar 13, 2023 •

edited

MichaReiser commented Mar 14, 2023

MichaReiser commented Mar 14, 2023 •

edited

sciyoshi commented Mar 15, 2023

Boshen commented Mar 15, 2023 •

edited

MichaReiser commented Mar 15, 2023 •

edited

github-actions bot commented Mar 15, 2023 •

edited

MichaReiser commented Mar 15, 2023 •

edited

MichaReiser Mar 15, 2023

MichaReiser Mar 15, 2023

charliermarsh left a comment

ci: Benchmark CI Step #3480

ci: Benchmark CI Step #3480

Conversation

MichaReiser commented Mar 13, 2023 • edited

Test Plan

MichaReiser commented Mar 14, 2023

MichaReiser commented Mar 14, 2023 • edited

sciyoshi commented Mar 15, 2023

Boshen commented Mar 15, 2023 • edited

MichaReiser commented Mar 15, 2023 • edited

github-actions bot commented Mar 15, 2023 • edited

PR Check Results

Ecosystem

Benchmark

Linux

Windows

MichaReiser commented Mar 15, 2023 • edited

MichaReiser Mar 15, 2023

Choose a reason for hiding this comment

MichaReiser Mar 15, 2023

Choose a reason for hiding this comment

charliermarsh left a comment

Choose a reason for hiding this comment

MichaReiser commented Mar 13, 2023 •

edited

MichaReiser commented Mar 14, 2023 •

edited

Boshen commented Mar 15, 2023 •

edited

MichaReiser commented Mar 15, 2023 •

edited

github-actions bot commented Mar 15, 2023 •

edited

MichaReiser commented Mar 15, 2023 •

edited