ci: add performance benchmarks #4998

agostbiro · 2024-03-14T19:23:27Z

Adds automated performance regression checks for PRs and historic progress tracking in main. The benchmark replays RPC calls captured in third-party test suites.

Runner

We're using a self-hosted GitHub Actions runner to run benchmarks in a reproducible environment on a bare-metal instance with an Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz. The runner is executed as an unprivileged user, and standard security hardening was applied to the instance. The runner is run as a systemd service with automatic restarts, and the instance reboots if many restarts fail.

I spent some time trying to find a managed CI provider with a deterministic execution environment but couldn't, so it seems we're stuck with self-hosted runners at least in the mid term.

Reports

We are using github-action-benchmark to track historic progress and raise regression alerts on PRs.

Historic

Benchmark data is to be stored in a separate repo. The GitHub Pages site in this repo will show historic benchmark data for main like this. (The example only includes the seaport scenario, as the entire benchmark run is slow.)

Regressions in PRs

When a performance regression is detected in a PR, based on a preconfigured threshold relative to the latest commit in main, the following error is displayed in the PR:

Clicking into the workflow, the following table is displayed giving information about the magnitude of the regression (the Markdown isn't rendered properly, this is a bug in github-action-benchmark):

The check fails if any of the metrics regresses.
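As a rough illustration of the rule above, here is a minimal sketch of a threshold check. It is not how github-action-benchmark implements alerts internally; the function name `findRegressions`, the metric shape `{ name, value }`, and the sample numbers are assumptions made for this example. The 1.1 default mirrors the 110% alert threshold set later in this PR.

```javascript
// Hypothetical helper: names and data shapes are assumptions for illustration.
// Each metric is { name, value } where a smaller value is better (e.g. time in ms).
function findRegressions(baseline, current, alertThreshold = 1.1) {
  const baselineByName = new Map(baseline.map((m) => [m.name, m.value]));
  const regressions = [];
  for (const { name, value } of current) {
    const previous = baselineByName.get(name);
    // Skip metrics with no usable baseline (new metric or zero baseline).
    if (previous === undefined || previous <= 0) continue;
    const ratio = value / previous;
    if (ratio > alertThreshold) {
      regressions.push({ name, previous, current: value, ratio });
    }
  }
  return regressions;
}

// Made-up sample data: the latest result on main vs. the result in the PR.
const baseline = [
  { name: "seaport", value: 1000 },
  { name: "total", value: 5000 },
];
const current = [
  { name: "seaport", value: 1200 },
  { name: "total", value: 5100 },
];

const regressions = findRegressions(baseline, current);
// The check fails if any single metric regressed beyond the threshold.
const failed = regressions.length > 0;
```

With the sample data, only the `seaport` metric exceeds the threshold, which is enough to fail the whole check.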

Snapshot testing

The benchmark workflow checks that the same RPC calls succeed and fail as when the scenario was collected. If the snapshot doesn't match, the workflow fails and the difference is reported.
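The comparison could look roughly like the sketch below. The snapshot format (a scenario name mapped to the indices of the RPC calls that failed at collection time) and the function name `diffFailures` are assumptions for illustration, not the format used in the repo.

```javascript
// Hypothetical sketch: the snapshot format and function name are assumptions.
// A snapshot maps a scenario name to the indices of the RPC calls that failed
// when the scenario was collected.
function diffFailures(expectedIndices, actualIndices) {
  const expected = new Set(expectedIndices);
  const actual = new Set(actualIndices);
  return {
    // Calls that succeeded at collection time but fail now.
    unexpectedFailures: actualIndices.filter((i) => !expected.has(i)),
    // Calls that failed at collection time but succeed now.
    unexpectedSuccesses: expectedIndices.filter((i) => !actual.has(i)),
  };
}

// Made-up sample data.
const snapshot = { seaport: [3, 17, 42] };
const observed = { seaport: [3, 42, 58] };

const diff = diffFailures(snapshot.seaport, observed.seaport);
const matches =
  diff.unexpectedFailures.length === 0 &&
  diff.unexpectedSuccesses.length === 0;
if (!matches) {
  // Report the difference so the workflow log shows what changed.
  console.error("Snapshot mismatch for seaport:", JSON.stringify(diff));
}
```

Reporting both directions separately makes it clear whether a call started failing or an expected failure disappeared.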

TODOs for this PR

Update expected failures snapshot in the repo to include all scenarios
Calibrate regression alert threshold percentage
Create benchmark data repo under NomicFoundation and configure permissions and self-hosted runner in NomicFoundation/hardhat

Limitations/things to improve in future

A full run takes an hour and there is only one runner, so tasks might queue up. We could improve things by making the benchmark run faster and by introducing more runners.
Only runs for team members’ and collaborators’ PRs since, according to GitHub, it's unsafe to allow third parties to run code on self-hosted runners.
Currently we rerun the benchmark in main even if it was already run in a PR that was merged. This could probably be avoided.
Can't use Swatinem/rust-cache@v2 in the workflow as it needs sudo on the runner, which is too dangerous. We should look into alternatives to avoid recompiling from scratch on every run.
The regression error message doesn’t render Markdown properly. I’ll open an issue about this in github-action-benchmark.
github-action-benchmark has a feature to comment on the PR, which would make the regression results easier to see, but this is blocked by the action using the same GitHub token to push to the benchmark result repo and to comment. We’d need to open a PR in the action repo to support it.

changeset-bot · 2024-03-14T19:23:31Z

⚠️ No Changeset found

Latest commit: 082edb9

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


vercel · 2024-03-14T19:23:32Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
hardhat	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Mar 19, 2024 4:38pm

agostbiro · 2024-03-14T19:24:14Z

.github/workflows/edr-benchmark.yml

+    paths:
+      - ".github/workflows/edr-benchmark.yml"
+      - "rust-toolchain"
+      - "Cargo.lock"
+      - "Cargo.toml"
+      - "crates/**"
+  pull_request:
+    branches:
+      - "**"
+    paths:
+      - ".github/workflows/edr-benchmark.yml"
+      - "rust-toolchain"
+      - "Cargo.lock"
+      - "Cargo.toml"
+      - "crates/**"


Only running for changes to EDR since it's very slow and there is only one runner. We'll do this anyway once EDR is a separate repo.

agostbiro · 2024-03-14T19:24:58Z

crates/tools/js/benchmark/index.js

+  parser.add_argument("-o", "--benchmark-output", {
+    type: "str",
+    default: "./benchmark-output.json",
+    help: "Where to save the benchmark output file",
+  });


Save this to disk instead of writing it to stdout, since multiple commands operate on it.
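A minimal sketch of why a file works well here: one step writes the results, and later steps read the same file. The helper names and the entry shape (`{ name, unit, value }`) are assumptions for illustration, and the example writes to a temporary directory rather than the default `./benchmark-output.json`.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helpers; the entry shape ({ name, unit, value }) is an assumption.
function writeBenchmarkOutput(entries, outputPath) {
  fs.writeFileSync(outputPath, JSON.stringify(entries, null, 2) + "\n");
}

function readBenchmarkOutput(outputPath) {
  return JSON.parse(fs.readFileSync(outputPath, "utf8"));
}

// Use a temporary directory so the example doesn't touch the working tree.
const outputPath = path.join(
  fs.mkdtempSync(path.join(os.tmpdir(), "benchmark-")),
  "benchmark-output.json"
);

// One command writes the results...
writeBenchmarkOutput([{ name: "seaport", unit: "ms", value: 1234 }], outputPath);
// ...and any later command can read them back without re-running the benchmark.
const roundTripped = readBenchmarkOutput(outputPath);
```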

Wodann

LGTM! Thank you for the clear explanation in the PR description. That was very helpful!

I just have one question to clarify that my understanding of the GitHub workflow is correct.

.github/workflows/edr-benchmark.yml

This reverts commit eccb51c.

nodejs/node#11683

agostbiro · 2024-03-19T18:28:10Z

Follow up: NomicFoundation/edr#337

github-actions bot assigned kanej Mar 14, 2024

github-actions bot added the status:triaging label Mar 14, 2024

agostbiro commented Mar 14, 2024

ci: add performance benchmarks

ef89eb0

agostbiro force-pushed the ci/performance-benchmarks branch from 830788c to ef89eb0 Compare March 14, 2024 19:25

agostbiro assigned agostbiro and unassigned kanej Mar 14, 2024

agostbiro requested a review from Wodann March 14, 2024 19:26

agostbiro added no changeset needed area:edr and removed status:triaging labels Mar 14, 2024

vercel bot deployed to Preview March 14, 2024 19:27 View deployment

agostbiro linked an issue Mar 14, 2024 that may be closed by this pull request

Implement automated regression checks for PRs NomicFoundation/edr#312

Closed

Wodann approved these changes Mar 14, 2024

agostbiro added 2 commits March 15, 2024 12:26

Add snapshot for all scenarios

4b6841d

Don't try to run snapshot as scenario

a4eae0f

vercel bot deployed to Preview March 15, 2024 11:28 View deployment

Fix check for trusted collaborators

50eb9b0

vercel bot deployed to Preview March 15, 2024 14:09 View deployment

Allow running in main

5d863d3

vercel bot deployed to Preview March 15, 2024 14:12 View deployment

agostbiro added 2 commits March 15, 2024 15:58

Update results repository

0146afd

Spawn subprocess for each scenario

42e7080

vercel bot deployed to Preview March 15, 2024 16:02 View deployment

agostbiro added 2 commits March 15, 2024 18:25

Fix output being truncated

20212c8

Flush stdout explicitly

fa75948

vercel bot deployed to Preview March 15, 2024 17:54 View deployment

agostbiro added 2 commits March 18, 2024 23:09

Revert "Temporarily disable get block by number"

ca32351

This reverts commit eccb51c.

Add nodejs flags to reduce variance

206deb1

nodejs/node#11683

vercel bot deployed to Preview March 18, 2024 22:12 View deployment

Add nodejs flags to reduce variance to subprocesses

65165fe

vercel bot deployed to Preview March 19, 2024 08:39 View deployment

Increase --max-old-space-size=28000

f1be692

vercel bot deployed to Preview March 19, 2024 14:19 View deployment

Hold on to RPC results to avoid GC

acad1ee

agostbiro force-pushed the ci/performance-benchmarks branch from 1dc3388 to acad1ee Compare March 19, 2024 14:19

vercel bot deployed to Preview March 19, 2024 14:21 View deployment

Increase alert threshold to 110%

eb61f90

vercel bot deployed to Preview March 19, 2024 16:19 View deployment

agostbiro mentioned this pull request Mar 19, 2024

Calibrate performance alert threshold NomicFoundation/edr#337

Closed

Temp

03eb2b0

vercel bot deployed to Preview March 19, 2024 16:24 View deployment

Debug

32f7769

agostbiro had a problem deploying to github-action-benchmark March 19, 2024 16:27 — with GitHub Actions Failure

Debug

46c0445

agostbiro had a problem deploying to github-action-benchmark March 19, 2024 16:28 — with GitHub Actions Error

vercel bot deployed to Preview March 19, 2024 16:29 View deployment

Debug

68f60dd

agostbiro had a problem deploying to github-action-benchmark March 19, 2024 16:34 — with GitHub Actions Error

vercel bot deployed to Preview March 19, 2024 16:35 View deployment

Remove debug

082edb9

agostbiro had a problem deploying to github-action-benchmark March 19, 2024 16:37 — with GitHub Actions Failure

vercel bot deployed to Preview March 19, 2024 16:38 View deployment

agostbiro merged commit 7164c5d into main Mar 19, 2024
42 of 43 checks passed

agostbiro deleted the ci/performance-benchmarks branch March 19, 2024 18:28

github-actions bot locked as resolved and limited conversation to collaborators Jun 19, 2024
