Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: add performance benchmarks #4998

Merged
merged 26 commits into from
Mar 19, 2024
Merged

ci: add performance benchmarks #4998

merged 26 commits into from
Mar 19, 2024

Conversation

agostbiro
Copy link
Member

@agostbiro agostbiro commented Mar 14, 2024

Adds automated performance regression checks for PRs and historic progress tracking in main. The benchmark replays RPC calls captured in third-party test suites.

Runner

We're using a self-hosted Github Action runner to run benchmarks in a reproducible environment on a bare metal instance with an Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz cpu. The runner is executed with an unprivileged user and standard security hardening was applied to the instance. The runner is ran as a systemd service with automatic restarts and reboot if many restarts fail.

I spent some time trying find a managed CI provider with a deterministic execution environment, but couldn't, so it seems we're stuck with self-hosted runners at least mid term.

Reports

We are using github-action-benchmark to track historic progress and raise regression alerts on PRs.

Historic

Benchmark data is to be stored in a separate repo. The Github pages in this repo will show historic benchmark data for main like this. (The example only includes the seaport scenario as the entire benchmark run is slow).

Regressions in PRs

When a performance regression is detected in a PR based on a preconfigured threshold and compared to the latest commit in main, the following error is displayed in the PR:

Screenshot 2024-03-14 at 19 13 57

Clicking into the workflow, the following table is displayed giving information about the magnitude of the regression (the Markdown isn't rendered properly, this is a bug in github-action-benchmark):

Screenshot 2024-03-14 at 19 14 11

It is a failure if there is a regression on any of the metrics.

Snapshot testing

The benchmark workflow checks that the same RPC calls succeed and fail as when the scenario was collected. If the snapshot doesn't match, the workflow fails and the difference is reported.

TODOs for this PR

  • Update expected failures snapshot in the repo to include all scenarios
  • Calibrate regression alert threshold percentage
  • Create benchmark data repo under NomicFoundation and configure permissions and self-hosted runner in NomicFoundation/hardhat

Limitations/things to improve in future

  • Full run takes an hour and there is only one runner so tasks might queue up. We could improve things by making the benchmark run faster and by introducing more runners.
  • Only runs for team members’ and collaborators’ PRs as it's unsafe to allow third-parties to run code on self-hosted runners according to Github.
  • Currently we rerun the benchmark in main even if it was already ran in a PR that was merged. This could be probably avoided.
  • Can't use Swatinem/rust-cache@v2 in the workflow as it needs sudo on the runner which is too dangerous. We should look into alternatives to prevent recompiling from scratch on every run.
  • The regression error message doesn’t render Markdown properly. I’ll open an issue about this in github-action-benchmark.
  • github-action-benchmark has a feature to comment on the PR which would allow seeing the regression results easier, but this is blocked by the action using the same Github token to push to the benchmark result repo and to comment. We’d need to open a PR in the action repo to support it.

Sorry, something went wrong.

Copy link

changeset-bot bot commented Mar 14, 2024

⚠️ No Changeset found

Latest commit: 082edb9

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link

vercel bot commented Mar 14, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
hardhat ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 19, 2024 4:38pm

Comment on lines +7 to +21
paths:
- ".github/workflows/edr-benchmark.yml"
- "rust-toolchain"
- "Cargo.lock"
- "Cargo.toml"
- "crates/**"
pull_request:
branches:
- "**"
paths:
- ".github/workflows/edr-benchmark.yml"
- "rust-toolchain"
- "Cargo.lock"
- "Cargo.toml"
- "crates/**"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only running for changes to EDR since it's very slow and there is only one runner + we'll do this anyway once EDR is a separate repo

Comment on lines +28 to +32
parser.add_argument("-o", "--benchmark-output", {
type: "str",
default: "./benchmark-output.json",
help: "Where to save the benchmark output file",
});
Copy link
Member Author

@agostbiro agostbiro Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Save this to disk instead of stdout since multiple commands operate on it

Verified

This commit was signed with the committer’s verified signature.
willdurand William Durand
@agostbiro agostbiro force-pushed the ci/performance-benchmarks branch from 830788c to ef89eb0 Compare March 14, 2024 19:25
@agostbiro agostbiro assigned agostbiro and unassigned kanej Mar 14, 2024
@agostbiro agostbiro requested a review from Wodann March 14, 2024 19:26
@agostbiro agostbiro added no changeset needed This PR doesn't require a changeset area:edr and removed status:triaging labels Mar 14, 2024
@agostbiro agostbiro linked an issue Mar 14, 2024 that may be closed by this pull request
Copy link
Member

@Wodann Wodann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you for the clear explanation in the PR description. That was very helpful!

I just have one question to clarify that my understanding of the GitHub workflow is correct.

Verified

This commit was signed with the committer’s verified signature.
willdurand William Durand

Verified

This commit was signed with the committer’s verified signature.
willdurand William Durand
@agostbiro agostbiro had a problem deploying to github-action-benchmark March 19, 2024 16:27 — with GitHub Actions Failure
@agostbiro
Copy link
Member Author

Follow up: NomicFoundation/edr#337

@agostbiro agostbiro merged commit 7164c5d into main Mar 19, 2024
42 of 43 checks passed
@agostbiro agostbiro deleted the ci/performance-benchmarks branch March 19, 2024 18:28
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area:edr no changeset needed This PR doesn't require a changeset
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Implement automated regression checks for PRs
3 participants