Move alignment checks to codegen #117473

saethlin · 2023-11-01T00:53:25Z

Implementing UB checks entirely in a MIR transform is quite limiting. Since MIR transforms work on polymorphic MIR, we don't know for sure what all our types are, and sometimes we just have to give up on inserting a check. For example, we used to emit MIR to compute the alignment mask at runtime, because the pointee type could be generic, and we used to skip alignment checks where we weren't sure the pointee was sized. Implementing the checks in codegen frees us from those problems, because we get to deal with monomorphized types.
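To make the mask computation concrete, here is a minimal sketch of the check written as ordinary Rust rather than MIR or backend IR. The helper name `is_aligned_for` is hypothetical and is not part of this PR. The point is that `align_of::<T>()` only becomes a known constant once `T` is monomorphized, which is why a pass over generic MIR had to compute the mask at runtime.

```rust
use std::mem::align_of;

/// Hypothetical helper: returns true if `ptr` is suitably aligned for `T`.
/// Alignments are always powers of two, so `align - 1` is a bit mask
/// covering exactly the address bits that must be zero.
fn is_aligned_for<T>(ptr: *const T) -> bool {
    let mask = align_of::<T>() - 1;
    (ptr as usize) & mask == 0
}

fn main() {
    let words = [0u32; 2];
    let aligned = words.as_ptr();
    // One byte past an aligned address: not aligned for u32 on targets
    // where u32 has an alignment greater than 1.
    let misaligned = (aligned as *const u8).wrapping_add(1) as *const u32;

    println!("aligned:    {}", is_aligned_for(aligned));
    println!("misaligned: {}", is_aligned_for(misaligned));
}
```

In a monomorphized instance such as `is_aligned_for::<u32>`, the mask is a compile-time constant, so the whole check reduces to one `and` and one comparison.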

Initially I implemented this by stripping down the MIR pass to insert a new terminator, which codegen would lower to a check if it saw fit. That's the perf run that has no regression: #117473 (comment). Since then, I've decided that the better strategy is to do this entirely in codegen. Only touching codegen dramatically reduces the amount of compiler code this change needs to touch, and it means we will insert checks into functions from the standard library when they are codegenned in a crate compiled with debug assertions. Previously, *misaligned_ptr would be checked, but misaligned_ptr.read() would not. With this PR, both are checked, because we get checks in ptr::read. That's this perf run: #117473 (comment)
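The two access forms mentioned above look like this in user code. This is only an illustrative sketch: the function names are made up, and the program deliberately never performs a misaligned typed read, since that would be undefined behavior regardless of whether a check is inserted.

```rust
/// Reads through the pointer with a place expression (`*p`).
///
/// # Safety
/// `p` must be valid for reads and aligned for `u32`.
unsafe fn load_via_deref(p: *const u32) -> u32 {
    unsafe { *p }
}

/// Reads through the pointer with the library method `ptr::read`.
///
/// # Safety
/// Same requirements as `load_via_deref`.
unsafe fn load_via_read(p: *const u32) -> u32 {
    unsafe { p.read() }
}

fn main() {
    let words = [0x1122_3344u32, 0x5566_7788];
    let aligned = words.as_ptr();

    // Both forms are fine for an aligned pointer.
    let a = unsafe { load_via_deref(aligned) };
    let b = unsafe { load_via_read(aligned) };
    assert_eq!(a, b);

    // One byte past the start: still inside `words`, but not aligned for
    // u32 on targets where u32 has an alignment greater than 1.
    let misaligned = (aligned as *const u8).wrapping_add(1) as *const u32;

    // Passing `misaligned` to either function above would be undefined
    // behavior, so this sketch does not do it. `read_unaligned` is the
    // correct way to load from such an address.
    let c = unsafe { misaligned.read_unaligned() };
    println!("{a:#x} {b:#x} {c:#x}");
}
```

Per the description above, a debug-assertions build used to catch a misaligned pointer only in the `load_via_deref` form; with the checks in codegen, the `load_via_read` form is covered as well.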

The only thing that jumps out at me about this codegen change is that between any two statements, codegen can change which backend block it is generating code for without changing the current MIR block. We already do insert blocks on the fly for panics, but in that case we don't stay in the new block.
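A toy model may help show what "changing the backend block without changing the MIR block" means. Everything below (`ToyBuilder`, `Stmt`, `lower_mir_block`, the instruction strings) is hypothetical and is not rustc's or any backend's API. It only mimics the shape of the lowering: one MIR block goes in, and the statements after a checked load end up in a different backend block.

```rust
/// Toy stand-in for a backend function builder.
struct ToyBuilder {
    /// Instructions emitted so far, grouped by backend block.
    blocks: Vec<Vec<String>>,
    /// Index of the backend block currently being filled.
    current: usize,
}

impl ToyBuilder {
    fn new() -> Self {
        ToyBuilder { blocks: vec![Vec::new()], current: 0 }
    }

    fn create_block(&mut self) -> usize {
        self.blocks.push(Vec::new());
        self.blocks.len() - 1
    }

    fn switch_to_block(&mut self, block: usize) {
        self.current = block;
    }

    fn emit(&mut self, inst: impl Into<String>) {
        self.blocks[self.current].push(inst.into());
    }
}

/// Toy MIR statement: a plain operation, or a load through a raw pointer
/// that needs an alignment check in front of it.
enum Stmt {
    Op(&'static str),
    LoadThroughPtr(&'static str),
}

/// Lowers the statements of ONE MIR block. A checked load splits the
/// backend block in two, and lowering continues in the success block.
fn lower_mir_block(bx: &mut ToyBuilder, stmts: &[Stmt]) {
    for stmt in stmts {
        match stmt {
            Stmt::Op(name) => bx.emit(*name),
            Stmt::LoadThroughPtr(ptr) => {
                let panic_block = bx.create_block();
                let success = bx.create_block();
                bx.emit(format!("br_if aligned({ptr}), bb{success}, bb{panic_block}"));

                // The failure path gets its own block and ends there.
                bx.switch_to_block(panic_block);
                bx.emit("call panic_misaligned_pointer_dereference");

                // Unlike the panic case, we stay in the new block for the
                // remaining statements of the same MIR block.
                bx.switch_to_block(success);
                bx.emit(format!("load {ptr}"));
            }
        }
    }
}

fn main() {
    let mut bx = ToyBuilder::new();
    let stmts = [Stmt::Op("a = 1"), Stmt::LoadThroughPtr("p"), Stmt::Op("b = 2")];
    lower_mir_block(&mut bx, &stmts);
    for (i, insts) in bx.blocks.iter().enumerate() {
        println!("bb{i}: {insts:?}");
    }
}
```

The consequence for the rest of codegen is that anything which assumed "one MIR block, one backend block" has to ask the builder for the current block instead of caching it.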

I'm writing this with the expectation that I implement the niche checks in the same manner, because they have the same problem with polymorphic MIR, possibly worse.

I did a GitHub code search, and the only users of the old opt-out, -Zmir-enable-passes=-CheckAlignment, were turning it off because of the problem with i686-pc-windows-msvc. That shouldn't be a problem anymore, because we don't emit alignment checks on that target. Note that -Zmir-enable-passes=-CheckAlignment will silently stop doing anything: we never check that the passes given to -Zmir-enable-passes match the names of any actual passes.

If the new checks cause issues, users now have the opt-out from #123411: -Zub-checks=no.

compiler/rustc_codegen_cranelift/src/base.rs

saethlin · 2023-11-12T02:12:34Z

@bors try @rust-timer queue

bors · 2023-11-12T02:13:43Z

⌛ Trying commit 165048a with merge 8d257b9...

…=<try> Move alignment checks to codegen Implementing UB checks entirely in a MIR transform is quite limiting, we don't know for sure what all our types are so we need to make a lot of sacrifices. For example in here we used to emit MIR to compute the alignment mask at runtime, because the pointee type could be generic. Implementing the checks in codegen frees us from that requirement, because we get to deal with monomorphized types. But I don't think we can move these checks entirely into codegen, because inserting the check needs to insert a new terminator into a basic block, which splits the previous basic block into two. We can't add control flow like this in codegen, but we can in MIR. So now the MIR transform just inserts a `TerminatorKind::UbCheck` which is effectively a `Goto` that also reads an `Operand` (because it either goes to the target block or terminates), and codegen expands that new terminator into the actual check. --- Also I'm writing this with the expectation that I implement the niche checks in the same manner, because they have the same problem with polymorphic MIR, possibly worse. r? `@ghost`

bors · 2023-11-12T03:39:34Z

☀️ Try build successful - checks-actions
Build commit: 8d257b9 (8d257b961e0625aa5abc7146a6ee857f6182c524)

rust-timer · 2023-11-12T06:35:58Z

Finished benchmarking commit (8d257b9): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.4% | [0.4%, 4.4%] | 2 |
| Regressions ❌ (secondary) | 1.8% | [0.7%, 4.2%] | 4 |
| Improvements ✅ (primary) | -1.7% | [-4.1%, -0.0%] | 4 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.3% | [-4.1%, 4.4%] | 6 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.5%, 0.5%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.4% | [-2.7%, -2.1%] | 4 |
| All ❌✅ (primary) | 0.5% | [0.5%, 0.5%] | 1 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.5%] | 10 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.0% | [-0.1%, -0.0%] | 8 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.1%, 0.5%] | 18 |

Bootstrap: 674.671s -> 673.178s (-0.22%)
Artifact size: 311.12 MiB -> 311.05 MiB (-0.02%)

bors · 2024-03-29T02:45:02Z

☔ The latest upstream changes (presumably #122671) made this pull request unmergeable. Please resolve the merge conflicts.

bjorn3 · 2024-04-06T07:34:17Z

compiler/rustc_codegen_cranelift/src/base.rs

+    );
+
+    // Continue codegen in the success block
+    fx.bcx.switch_to_block(success);


Please add a nop in the success block. This is necessary for clif ir printing annotations to work correctly.

Why don't we need a nop in the panic block?

It may be necessary there too.

bors · 2024-04-07T02:21:48Z

☔ The latest upstream changes (presumably #123576) made this pull request unmergeable. Please resolve the merge conflicts.

saethlin · 2024-04-07T04:31:08Z

r? oli-obk

rustbot · 2024-04-07T04:31:11Z

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras

The Miri subtree was changed

cc @rust-lang/miri

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

This PR changes Stable MIR

cc @oli-obk, @celinval, @ouz-a

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

celinval · 2024-04-07T21:26:10Z

compiler/stable_mir/src/mir/body.rs

@@ -246,7 +246,6 @@ pub enum AssertMessage {
    RemainderByZero(Operand),
    ResumedAfterReturn(CoroutineKind),
    ResumedAfterPanic(CoroutineKind),
-    MisalignedPointerDereference { required: Operand, found: Operand },


Can you please just mark this as deprecated for now instead of removing it? Thanks

bors · 2024-05-10T18:27:13Z

☔ The latest upstream changes (presumably #124972) made this pull request unmergeable. Please resolve the merge conflicts.

celinval · 2024-05-10T20:50:51Z

Has anyone considered creating MIR passes on monomorphic MIR? I see a pattern of pushing things to codegen that should really be implemented as instrumentation passes. The code generator shouldn't be creating new basic blocks.

saethlin · 2024-05-10T21:14:20Z

> Has anyone considered creating MIR passes on monomorphic MIR?

Yes. Many times. Nobody is happy with the amount of cleverness in codegen.

MIR is monomorphized on-the-fly as an optimization, because otherwise we'd have to clone all MIR bodies at codegen so that we can mutate them. Or we could probably have a really complicated accessor for the MIR like MirPatch? MirPatch doesn't support most of the operations I've done in codegen, and it doesn't fold in access to the patched state.

celinval · 2024-05-10T23:59:13Z

Do you know what the overhead would be if we clone the bodies lazily, just for functions that need transformation? For example, I'm assuming these checks would only be required in functions that perform unsafe operations.
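For readers unfamiliar with the idea, "clone lazily, only for bodies that need transformation" is the pattern that `std::borrow::Cow` expresses. The sketch below is a toy under stated assumptions: `Stmt` and `instrument` are hypothetical, a "body" is just a slice of statements, and the rule for which bodies need a check is a placeholder rather than what the compiler would actually use.

```rust
use std::borrow::Cow;

/// Toy statement kinds; stand-ins for real MIR statements.
#[derive(Clone, Debug, PartialEq)]
enum Stmt {
    SafeOp,
    RawPtrDeref,
    AlignmentCheck,
}

/// Returns the body unchanged (borrowed, no allocation) if there is
/// nothing to instrument, and a cloned, instrumented body otherwise.
fn instrument(body: &[Stmt]) -> Cow<'_, [Stmt]> {
    if !body.contains(&Stmt::RawPtrDeref) {
        return Cow::Borrowed(body);
    }
    let mut out = Vec::with_capacity(body.len() + 1);
    for stmt in body {
        // Insert a check immediately before every raw pointer dereference.
        if *stmt == Stmt::RawPtrDeref {
            out.push(Stmt::AlignmentCheck);
        }
        out.push(stmt.clone());
    }
    Cow::Owned(out)
}

fn main() {
    let safe_body = [Stmt::SafeOp, Stmt::SafeOp];
    let unsafe_body = [Stmt::SafeOp, Stmt::RawPtrDeref];

    for body in [&safe_body[..], &unsafe_body[..]] {
        match instrument(body) {
            Cow::Borrowed(b) => println!("untouched: {b:?}"),
            Cow::Owned(b) => println!("cloned and instrumented: {b:?}"),
        }
    }
}
```

The cost of this approach depends entirely on how many bodies fall into the `Owned` case, which is the question the reply below takes up.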

saethlin · 2024-05-11T00:14:01Z

> Do you know what the overhead would be if we clone the bodies lazily, just for functions that need transformation? For example, I'm assuming these checks would only be required in functions that perform unsafe operations.

I think you have a bit of an optimistic view of the situation based on only looking at the changes in this PR. Consider also #121174. And also, if we had such a change I would like to use it to do SimplifyCfg on monomorphic MIR, to clean up the result of this traversal strategy:

rust/compiler/rustc_codegen_ssa/src/mir/mod.rs

Lines 270 to 281 in 6e1d947

    
    let reachable_blocks = traversal::mono_reachable_as_bitset(mir, cx.tcx(), instance);
    // Codegen the body of each block using reverse postorder
    for (bb, _) in traversal::reverse_postorder(mir) {
        if reachable_blocks.contains(bb) {
            fx.codegen_block(bb);
        } else {
            // We want to skip this block, because it's not reachable. But we still create
            // the block so terminators in other blocks can reference it.
            fx.codegen_block_as_unreachable(bb);
        }
    }

I know from looking at the IR we produce that the mono-reachable traversal produces goto chains. Like everything else here, that optimization is possible to implement in a lazy fashion without some MIR to mutate, but it would be complicated.
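As a rough illustration of the goto-chain cleanup being described, here is a toy version over a made-up CFG representation. `Block`, `resolve`, and `collapse_goto_chains` are hypothetical names, and this covers only the jump-threading part of what a pass like SimplifyCfg does; it assumes mutable access to the blocks, which is exactly what codegen does not have today.

```rust
/// Toy basic block: either an empty block that only jumps elsewhere, or a
/// block that does real work and optionally jumps to a successor.
#[derive(Clone, Debug, PartialEq)]
enum Block {
    Goto(usize),
    Work { name: &'static str, target: Option<usize> },
}

/// Follows a chain of goto-only blocks from `start` to the first block
/// that does work. The walk is bounded by the number of blocks so that a
/// cycle of gotos cannot loop forever; in that case `start` is returned.
fn resolve(blocks: &[Block], start: usize) -> usize {
    let mut bb = start;
    for _ in 0..blocks.len() {
        match blocks[bb] {
            Block::Goto(next) => bb = next,
            Block::Work { .. } => return bb,
        }
    }
    start
}

/// Rewrites every working block's target to skip over goto-only blocks.
fn collapse_goto_chains(blocks: &mut [Block]) {
    for i in 0..blocks.len() {
        let new_target = match blocks[i] {
            Block::Work { target: Some(t), .. } => resolve(blocks, t),
            _ => continue,
        };
        if let Block::Work { target, .. } = &mut blocks[i] {
            *target = Some(new_target);
        }
    }
}

fn main() {
    // bb0 -> bb1 -> bb2 -> bb3, where bb1 and bb2 are empty goto blocks.
    let mut blocks = vec![
        Block::Work { name: "entry", target: Some(1) },
        Block::Goto(2),
        Block::Goto(3),
        Block::Work { name: "exit", target: None },
    ];
    collapse_goto_chains(&mut blocks);
    println!("{blocks:?}");
}
```

Doing the same thing lazily, without a mutable body, would mean resolving the chain at every place a terminator's target is read, which is the kind of complexity the comment above is pointing at.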

saethlin · 2024-05-12T13:02:53Z

Based on what I'm seeing in #125025, maybe cloning all the MIR is not too expensive.

celinval · 2024-05-13T22:46:36Z

> Based on what I'm seeing in #125025, maybe cloning all the MIR is not too expensive.

That's similar to our findings when we migrated to using StableMIR in Kani. StableMIR supports monomorphic bodies for instances.

RalfJung · 2024-05-14T05:40:00Z

FWIW, MIR also supports monomorphic bodies -- in the MIR-to-MiniRust translation, we monomorphize the entire MIR body before translating it.

celinval · 2024-05-14T17:24:30Z

> FWIW, MIR also supports monomorphic bodies -- in the MIR-to-MiniRust translation, we monomorphize the entire MIR body before translating it.

Yes, that's how StableMIR is implemented too, but I believe you still need to clone the body.

rustbot added the S-waiting-on-review and T-compiler labels Nov 1, 2023

saethlin added the T-opsem label Nov 1, 2023

saethlin force-pushed the codegen-alignment-checks branch from c0b5969 to 487ffa6 on November 1, 2023 03:36

saethlin force-pushed the codegen-alignment-checks branch from 487ffa6 to f976ac6 on November 1, 2023 04:10

bjorn3 reviewed Nov 1, 2023

compiler/rustc_codegen_cranelift/src/base.rs (outdated, resolved)

saethlin force-pushed the codegen-alignment-checks branch from f976ac6 to b8cc419 on November 1, 2023 13:03

saethlin force-pushed the codegen-alignment-checks branch from b8cc419 to f6feccf on November 1, 2023 20:31

saethlin force-pushed the codegen-alignment-checks branch from f6feccf to 72aaa7d on November 1, 2023 21:18

saethlin force-pushed the codegen-alignment-checks branch from 72aaa7d to df639ed on November 1, 2023 21:27

saethlin force-pushed the codegen-alignment-checks branch from df639ed to 9fc6dde on November 5, 2023 03:11

saethlin force-pushed the codegen-alignment-checks branch from 9fc6dde to 4c33915 on November 12, 2023 00:22

saethlin force-pushed the codegen-alignment-checks branch from 4c33915 to 165048a on November 12, 2023 01:27

rustbot added the S-waiting-on-perf label Nov 12, 2023

saethlin force-pushed the codegen-alignment-checks branch from 45ba3cb to 07baa1c on March 22, 2024 23:09

saethlin force-pushed the codegen-alignment-checks branch from 07baa1c to d86b697 on March 22, 2024 23:46

saethlin force-pushed the codegen-alignment-checks branch from d86b697 to 4efda2f on April 5, 2024 21:12

saethlin force-pushed the codegen-alignment-checks branch from 4efda2f to 9bf53ba on April 5, 2024 22:50

bjorn3 reviewed Apr 6, 2024

saethlin force-pushed the codegen-alignment-checks branch 2 times, most recently from 9b79098 to 770ca3d on April 7, 2024 04:25

saethlin marked this pull request as ready for review April 7, 2024 04:31

rustbot assigned oli-obk Apr 7, 2024

celinval reviewed Apr 7, 2024

saethlin force-pushed the codegen-alignment-checks branch from 770ca3d to fa98120 on April 7, 2024 21:57

saethlin force-pushed the codegen-alignment-checks branch from fa98120 to 52f2d3f on May 10, 2024 23:15

Move alignment checks to codegen (0bc84fa)

saethlin force-pushed the codegen-alignment-checks branch from 52f2d3f to 0bc84fa on May 11, 2024 00:16
