chore: enable cranelift (switch to nightly) #4498
Because we have so many crates, they are all already compiling in parallel. If you compile a single project, then it will speed up. like |
That's a rare use-case, a chunk of that time is already fast linking with mold, and then it will be like 2 seconds anyway. @elsirion I remember you showing a pretty graph showing cpus were under-utilized and mentioning lots of cores. Do you happen to have a machine like that? BTW. What was that tool that generated that pretty graph? |
cargo build --timings maybe? |
See my edit above. I don't think that's it. |
that's |
Also, we need to bench, because that only runs the frontend (which is the part parallelized in multithreaded rustc) |
Not going to fix the padding, too much work. :D Notably when threads are enabled the I guess on systems that have more, but slower, cores the gain might be better(?). |
Master:
This branch:
So this branch is slower in most stuff. |
I see a slight improvement in release mode, but it might be noise; debug is slightly slower:
|
I was going to close it, but maybe I could try https://www.reddit.com/r/rust/comments/1bgyo8a/try_cranelift_codegen_backend_for_faster_compile/ while at it. |
Nice! Before cranelift:
After cranelift:
|
Before:
After:
I like! 😍 |
It's green. It's only enabled in dev builds, so IMO low risk. Let's give it a spin! |
07d39c1 to b0d83fd
Bummer.
|
OK, so this can be fixed by updating nightly. 🍀 But now the problem is that dev builds and tests are much slower. 5x or so. I don't think anyone wants to wait 1-2 minutes to open |
How so? |
That's the whole thing with cranelift - it is supposed to be faster at the expense of optimization power. It only has basic optimizations currently; maybe it will get somewhat better in the future. In our case it looks like consensus is getting extremely slow. Possibly just some crypto suffers from the lack of a certain optimization? Hard to tell. I tried increasing the optimization level on certain things, but without success. |
For cryptography-related crates this project is using opt-level = 3 (lines 107 to 125 in 29cf90e).
You can add codegen-backend = "llvm" to each of these package profile overrides (so for example secp256k1 = { opt-level = 3, codegen-backend = "llvm" }). Be aware, however, that there is currently an ABI incompatibility around 128-bit integers: rust-lang/rustc_codegen_cranelift#1449. I haven't gotten around to fixing it properly yet.
ps: This is the project you mentioned in https://lobste.rs/s/tjh7oy/cranelift_code_generation_comes_rust#c_gq7fp3, right? Edit: Opened rust-lang/rustc_codegen_cranelift#1468 to track emitting a warning for this case to reduce confusion in the future. |
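For readers following along, the per-crate override described above could look roughly like the fragment below. This is only a sketch: the `[profile.dev]` placement is an assumption based on the backend being enabled for dev builds only, and secp256k1 is just the one crate named in the comment; the project's actual list of overrides is not reproduced here. It needs a nightly toolchain with the cranelift backend component installed.

```toml
# Cargo.toml (nightly only). The unstable feature gate has to come first.
cargo-features = ["codegen-backend"]

[profile.dev]
# Use cranelift for everything by default in dev builds.
codegen-backend = "cranelift"

[profile.dev.package]
# Perf-heavy crates stay on LLVM with full optimizations.
# Only secp256k1 is named in the thread; other entries would follow the same shape.
secp256k1 = { opt-level = 3, codegen-backend = "llvm" }
```

Mixing backends like this is subject to the 128-bit integer ABI incompatibility mentioned above.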
@bjorn3 Oh, it's already possible to pick these per-crate? Awesome. I'll try it out and report back. Thanks for letting us know, and I'll sub these issues and keep an eye on it. |
Did another round of testing and pushed the best approach. Current master branch:
After, with cranelift + parallel (but now with llvm for perf-heavy crates):
just cranelift:
no -O3 on local crypto crates (fedimint-threshold-crypto) + cranelift:
O3 -> O2 + above:
With this setup the |
Have we tested this on mac @justinmoon @dpc @maan2003 ? Merge if so.
When I checkout the branch and direnv runs, the nix environment doesn't build.
|
OH. Of course. Great. I could make this component conditional, but then we would need to make the backend setting in Cargo.toml conditional on the current system. And that is probably possible only via build.rs setting up some feature flag, at which point it doesn't seem all that worthwhile. |
Close the PR then for now? |
Can we have a rustc wrapper that ignores the codegen-backend arguments on macOS? |
61s => 40s on a fast system will be insane speed for slow systems. |
There's only so much complexity we want to deal with. In a couple of months macOS support might already be implemented, and we can try again. |
Not planning to land it right now.
On top of pending PRs, due to possibility of merge conflicts.
See https://blog.rust-lang.org/2023/11/09/parallel-rustc.html
On my machine I've noticed zero speedup. It might speed up someone's compilation time, but I guess it would require lots of CPU cores (more than the 20 I have?).
Use just bench-compilation to bench. If anyone sees a speedup and it seems beneficial, we could land. Otherwise I'm a bit disappointed.
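For anyone who wants to try the parallel front end from the linked blog post outside of this branch, a minimal sketch of a `.cargo/config.toml` follows. The thread count of 8 is the example value from the blog post, not something tuned for this project, and this is not necessarily how the PR itself wires the flag in.

```toml
# .cargo/config.toml (nightly only)
[build]
# Ask rustc to run its front end on up to 8 threads.
rustflags = ["-Z", "threads=8"]
```

Comparing a run with and without this file on the same machine should show whether the extra threads help for a given core count.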