ci: add performance benchmarks #4998
    paths:
      - ".github/workflows/edr-benchmark.yml"
      - "rust-toolchain"
      - "Cargo.lock"
      - "Cargo.toml"
      - "crates/**"
  pull_request:
    branches:
      - "**"
    paths:
      - ".github/workflows/edr-benchmark.yml"
      - "rust-toolchain"
      - "Cargo.lock"
      - "Cargo.toml"
      - "crates/**"
Only running for changes to EDR, since the benchmark is very slow and there is only one runner. We'll do this anyway once EDR is a separate repo.
parser.add_argument("-o", "--benchmark-output", {
  type: "str",
  default: "./benchmark-output.json",
  help: "Where to save the benchmark output file",
});
Save this to disk instead of stdout since multiple commands operate on it
Force-pushed from 830788c to ef89eb0.
LGTM! Thank you for the clear explanation in the PR description. That was very helpful!
I just have one question to clarify that my understanding of the GitHub workflow is correct.
This reverts commit eccb51c.
Force-pushed from 1dc3388 to acad1ee.
Follow up: NomicFoundation/edr#337
Adds automated performance regression checks for PRs and historic progress tracking in `main`. The benchmark replays RPC calls captured in third-party test suites.
Runner
We're using a self-hosted GitHub Actions runner to run benchmarks in a reproducible environment on a bare metal instance with an Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz. The runner is executed as an unprivileged user, and standard security hardening was applied to the instance. The runner runs as a `systemd` service with automatic restarts, and the instance reboots if many restarts fail.
I spent some time trying to find a managed CI provider with a deterministic execution environment, but couldn't, so it seems we're stuck with self-hosted runners, at least in the mid term.
Reports
We are using github-action-benchmark to track historic progress and raise regression alerts on PRs.
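As a rough illustration of what feeding results to github-action-benchmark can look like: the action accepts a JSON array of `{ name, unit, value }` entries for its `customSmallerIsBetter` tool. Whether this PR uses that tool is an assumption, and the helper name `toBenchmarkEntries`, the `total` scenario, and the numbers below are made up for the sketch.

```javascript
// Hypothetical helper: converts per-scenario timings (in milliseconds) into
// the JSON array shape that github-action-benchmark accepts for its
// "customSmallerIsBetter" tool. Using that tool here is an assumption.
function toBenchmarkEntries(timingsMs) {
  return Object.entries(timingsMs).map(([scenario, ms]) => ({
    name: scenario,
    unit: "ms",
    value: ms,
  }));
}

// Scenario timings are invented for illustration only.
const entries = toBenchmarkEntries({ seaport: 1234.5, total: 9876 });
const json = JSON.stringify(entries, null, 2);
console.log(json);
```

The serialized string is what would be written to the benchmark output file, so that later workflow steps can read it from disk.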
Historic
Benchmark data is to be stored in a separate repo. The GitHub Pages site in this repo will show historic benchmark data for `main` like this. (The example only includes the `seaport` scenario, as the entire benchmark run is slow.)
Regressions in PRs
When a performance regression is detected in a PR, based on a preconfigured threshold and compared to the latest commit in `main`, the following error is displayed in the PR:
Clicking into the workflow shows the following table, which gives the magnitude of the regression (the Markdown isn't rendered properly; this is a bug in github-action-benchmark):
The check fails if there is a regression on any of the metrics.
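The threshold comparison can be sketched as follows. This is a minimal illustration, not the logic of github-action-benchmark itself: it assumes smaller values are better, and the function name `findRegressions`, the metric names, and the 1.1 threshold are all hypothetical.

```javascript
// Returns every metric whose current value exceeds previous * threshold.
// Assumes smaller is better (e.g. durations). A threshold of 1.1 means
// "flag anything more than 10% slower". All names here are hypothetical.
function findRegressions(previous, current, threshold) {
  const regressions = [];
  for (const [name, prevValue] of Object.entries(previous)) {
    const currValue = current[name];
    // Skip metrics that cannot be compared meaningfully.
    if (currValue === undefined || prevValue <= 0) continue;
    const ratio = currValue / prevValue;
    if (ratio > threshold) {
      regressions.push({ name, prevValue, currValue, ratio });
    }
  }
  return regressions;
}

const baseline = { seaport: 1000, total: 5000 }; // latest commit in main
const candidate = { seaport: 1300, total: 5100 }; // PR head
const regressions = findRegressions(baseline, candidate, 1.1);

// A regression on any single metric is enough to fail the check.
const failed = regressions.length > 0;
console.log(failed, regressions.map((r) => r.name));
```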
Snapshot testing
The benchmark workflow checks that the same RPC calls succeed and fail as when the scenario was collected. If the snapshot doesn't match, the workflow fails and the difference is reported.
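A minimal sketch of such a snapshot check, assuming each recorded call is reduced to its method name and a success flag. The record shape, the function name `diffSnapshot`, and the sample calls are assumptions for illustration, not the format used in this PR.

```javascript
// Compares the success/failure outcome of each replayed RPC call against
// the outcome recorded when the scenario was collected.
// The { method, success } shape is an assumption for this sketch.
function diffSnapshot(expected, actual) {
  const differences = [];
  const length = Math.max(expected.length, actual.length);
  for (let i = 0; i < length; i++) {
    const e = expected[i];
    const a = actual[i];
    const mismatch =
      e === undefined ||
      a === undefined ||
      e.method !== a.method ||
      e.success !== a.success;
    if (mismatch) {
      differences.push({ index: i, expected: e ?? null, actual: a ?? null });
    }
  }
  return differences;
}

const recorded = [
  { method: "eth_call", success: true },
  { method: "eth_sendTransaction", success: false },
];
const replayed = [
  { method: "eth_call", success: true },
  { method: "eth_sendTransaction", success: true },
];

const differences = diffSnapshot(recorded, replayed);
if (differences.length > 0) {
  // In a workflow, this is where the step would report the diff and fail.
  console.error("Snapshot mismatch:", JSON.stringify(differences, null, 2));
}
```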
TODOs for this PR
`NomicFoundation` and configure permissions and self-hosted runner in `NomicFoundation/hardhat`
Limitations/things to improve in future
The benchmark runs on `main` even if it was already run in a PR that was merged. This could probably be avoided.
We are not using `Swatinem/rust-cache@v2` in the workflow as it needs sudo on the runner, which is too dangerous. We should look into alternatives to prevent recompiling from scratch on every run.