Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving #[rustc_box] to move_val_init intrinsic #110715

Closed
est31 opened this issue Apr 23, 2023 · 28 comments
Closed

Moving #[rustc_box] to move_val_init intrinsic #110715

est31 opened this issue Apr 23, 2023 · 28 comments
Labels
A-allocators Area: Custom and system allocators T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@est31
Copy link
Member

est31 commented Apr 23, 2023

I have opened this issue for discussing how to move #[rustc_box] (introduced by #97293) to something more simpler. My original proposal (reproduced below) was to use builtin # syntax for it, but through the discussion, it was brought up that there used to be a move_val_init intrinsic which would have been a good fit. Initial tests, both with a builtin # ptr_write and move_val_init directly showed promising results: it can deliver the same codegen as #[rustc_box] can, which means that one could switch the vec![] macro from using #[rustc_box] towards Box::new_uninit_slice, then writing into the pointer with move_val_init, then calling asusme_init on it.

Right now, #[rustc_box] desugars to THIR ExprKind::Box, which desugars to code calling exchange_malloc, followed by ShallowInitBox, a construct specifically added to support box syntax, and then followed by something equivalent to move_val_init: a write into the pointer without putting the value into a local first.

A move to using move_val_init directly would simplify the code that desugars #[rustc_box] at the cost of the code in the vec![] macro. To me that seems like a win because it would make the code around creating boxes less special.


Original proposal:

I have opened this issue for discussing how to move #[rustc_box] (introduced by #97293) to builtin# syntax (#110680).

My original proposal was here, to introduce builtin#ptr_write, but I made that proposal while being unaware of ShallowInitBox. So maybe we'd need both builtin#ptr_write and builtin#shallow_init_box.

The options I see:

  • Add builtin#box instead of #[rustc_box], replacing it 1:1.
  • Add builtin#ptr_write and builtin#shallow_init_box, first doing latter and then writing into it doing former. But what would the latter return on a type level? A Box<T>?
  • Add builtin#ptr_write to then do in the Vec macro: first Box::new_uninit_slice, then builtin#ptr_write, then call assume_init on it. This mirrors the proposal by oli-obk but the issue seems to be very poor codegen (new godbolt example based on the one linked in the ShallowInitBox MCP). The issue might just be due to the .write call though.
  • Only add builtin#shallow_init_box. I'm not sure this will work though as the magic seems to hide in the pointer writing.

Maybe one could first add builtin#ptr_write without touching #[rustc_box] and then check if the godbolt example still has that behaviour?

cc @nbdd0121 @drmeepster @clubby789 @oli-obk

Earlier discussion: #110694 (comment)

@clubby789
Copy link
Contributor

clubby789 commented Apr 24, 2023

I implemented a basic builtin#ptr_write and the codegen looks about equivalent:

#![feature(builtin_syntax)]
#![feature(new_uninit)]
type Type = [u8; 128];
pub fn foo2(f: fn() -> Type) -> Box<Type> {
    let mut b = Box::<Type>::new_uninit();
    unsafe { 
    	builtin#ptr_write(f(), b.as_mut_ptr());
        b.assume_init()
    }
}
_ZN3poc4foo217hf17d37ea238339f3E:
	push	r14
	push	rbx
	push	rax
	mov	rbx, rdi
	mov	edi, 128
	mov	esi, 1
	call	qword ptr [rip + __rust_alloc@GOTPCREL]
	test	rax, rax
	je	.LBB0_1
	mov	r14, rax
	mov	rdi, rax
	call	rbx
	mov	rax, r14
	add	rsp, 8
	pop	rbx
	pop	r14
	ret
.LBB0_1:
	mov	edi, 128
	mov	esi, 1
	call	qword ptr [rip + _ZN5alloc5alloc18handle_alloc_error17h9f155cb3ff8eda02E@GOTPCREL]
	ud2

If there's interest I can push my work (based on top of the offset_of PR) although it's very rudimentary, missing proper type checking and probably not constructing MIR properly

@est31
Copy link
Member Author

est31 commented Apr 24, 2023

@clubby789 that looks very promising! I think this gives good support for builtin#ptr_write. A PR might be too early, I'd suggest opening one once #110694 is closer to getting merged (after an initial round of reviews for example).

@nbdd0121
Copy link
Contributor

Although we could shift library code to do box creation, followed by pointer write and then type conversion, I still think the best thing to do is to make Box::new a lang item and magically replace it during the lowering, and therefore allow all users to enjoy the benefit.

My patch in #87781 didn't work due to interaction with two-phase borrow, but that should be fixable.

@nbdd0121
Copy link
Contributor

ShallowInitBox (rust-lang/compiler-team#460) converts a *mut T to Box<T>. What's magical is that it's an uninitialized box, a construct that we have to support anyway because we can move values out from (and back into) a box.

So if we were to implement Box::new using the builtin syntax, it would probably just look like this:

    pub fn new(x: T) -> Self {
        let ptr = Box::into_raw(Box::<T>::new_uninit()).cast::<T>();
        let ret = builtin#shallow_init_box(ptr);
        *ret = ptr;
        ret
    }

@nbdd0121
Copy link
Contributor

#![feature(builtin_syntax)]
#![feature(new_uninit)]
type Type = [u8; 128];
pub fn foo2(f: fn() -> Type) -> Box<Type> {
    let mut b = Box::<Type>::new_uninit();
    unsafe { 
    	builtin#ptr_write(f(), b.as_mut_ptr());
        b.assume_init()
    }
}

What's the semantics of ptr_write here? Does it evaluate f directly into *b? If so, it sounds like this is reintroducing the move_val_init intrinsics, which we get rid of in #80290. cc @RalfJung

@clubby789
Copy link
Contributor

let src = &this.thir[src];
let dst = &this.thir[dst];
let dst = unpack!(block = this.as_place(block, dst));
unpack!(block = this.expr_into_dest(this.tcx.mk_place_deref(dst), block, src));

is the MIR building for this, so probably the same thing

@est31
Copy link
Member Author

est31 commented Apr 24, 2023

If so, it sounds like this is reintroducing the move_val_init intrinsics, which we get rid of in #80290.

Yeah that would be it, more or less. Linux is building a huge macro to support something like builtin#ptr_write: https://twitter.com/LinaAsahi/status/1570119345510182913 (code link) . I'm not sure if it's enough for their use cases though because they want an error if something gets copied onto the stack, not just silently having that copy happen.

I think builtin#ptr_write would be a nice building block for functionality that would allow you to tag functions as accepting some args in-place, after doing some preparatory work to create memory to put those args inside. This could then be used for Vec::push/HashMap::insert/ptr::write/etc, and not just Box::new alone. I don't think it's a good idea to make all of these functions lang items, instead it would be better to have one approach that works for all of these functions.

@est31
Copy link
Member Author

est31 commented Apr 24, 2023

a construct that we have to support anyway because we can move values out from (and back into) a box.

@nbdd0121 what do you mean by that? This is a general property, right? I can't see ShallowInitBox being constructed outside of #[rustc_box] lowering. There is some special casing for boxes in elaborate_drop.rs, but that one just does a .is_box() check to then call open_drop_for_box. It doesn't rely on ShallowInitBox.

@nbdd0121
Copy link
Contributor

If so, it sounds like this is reintroducing the move_val_init intrinsics, which we get rid of in #80290.

Yeah that would be it, more or less. Linux is building a huge macro to support something like builtin#ptr_write: twitter.com/LinaAsahi/status/1570119345510182913 (code link) . I'm not sure if it's enough for their use cases though because they want an error if something gets copied onto the stack, not just silently having that copy happen.

I happen to be very familiar with this ;) Asahi's macro doesn't (and won't) get into the mainline. Instead, we have a more general version of in-place/pinned initialization macros designed by y86-dev and me.

move_val_init is not sufficient for our use case, because (1) as you said, we want a guarantee that copy is not happening, and (2) we need initialization to be fallible, and move_val_init don't have a way to signal Err (in kernel we can't use panicking).

I am familiar with this removed intrinsic precisely because I was exploring it as a potential solution of the in-place initialization problem in Linux kernel, and it didn't work out eventually :(

a construct that we have to support anyway because we can move values out from (and back into) a box.

@nbdd0121 what do you mean by that? This is a general property, right? I can't see ShallowInitBox being constructed outside of #[rustc_box] lowering. There is some special casing for boxes in elaborate_drop.rs, but that one just does a .is_box() check to then call open_drop_for_box. It doesn't rely on ShallowInitBox.

Sorry, by "a construct" I mean the fact the box is uninitialized. I meant that we need to support box being uninitialized regardless whether ShallowInitBox was there.

@petrochenkov
Copy link
Contributor

builtin # foo is specifically for the cases where we cannot fit a feature into an existing syntax, neither into a function call, nor into an attribute, nor into anything else.

The cases described here very much fit into function calls, or other expressions (possibly with attributes) - all the arguments are expressions, you don't need to specify something like a field name as an argument like with offset_of.

I wouldn't want everything that is currently served by e.g. lang items to spread to syntactic level, there's no need for that, it's all about semantics.

@scottmcm
Copy link
Member

Overall box syntax might make sense as builtin#, but I agree that builtin#ptr_write doesn't need to be. It would just lower to *p = move val in MIR, so would be fine as a normal intrinsic without parser logic (like the existing intrinsics::read lowers to target = copy *p in MIR).

@est31
Copy link
Member Author

est31 commented Apr 24, 2023

Instead, we have a more general version of in-place/pinned initialization macros designed by y86-dev and me.

Ohhh that's very interesting. Do you have a link to the macros or maybe any public discussions?

I still don't fully understand why ShallowInitBox is needed for moving out of boxes, because there are ways to create boxes without #[rustc_box] being involved at all, say Box::from_raw/from_raw_in or the safe Box::new_in. Do you say if you move out of an initialized box in those instances, UB is involved?

I've tried using copy_nonoverlapping that ptr::write uses since #80290, and the result looked like the Box::<MaybeUninit<[...]>>::write version (link). I then locally enabled RUSTC_BOOTSTRAP and compared it with a version that uses move_val_init (using rustc 1.50.0; link to code but didnt get it to work in godbolt as I didn't know how to set RUSTC_BOOTSTRAP there). It actually generates code that looks like the box syntax code, so if we reintroduce move_val_init, we might be able to use it to get rid of #[rustc_box] entirely. To me that would seem like a simplification.

The cases described here very much fit into function calls

@petrochenkov thanks for explaining why you are against using #[rustc_box] here. You @scottmcm, as well as the precedent of move_val_init convinced me that there should be no builtin#ptr_write, but instead it should be done via intrinsics. In fact, it would be best to have #[rustc_box] as intrinsic too, but there is no module in alloc for allocation related intrinsics, only one in core, which obviously can't touch allocations.

The other thing to consider is the type_ascribe! macro. I guess it would be best to not migrate that one either to builtin#, but to instead turn it into a proper intrinsic of the signature fn ascribe<T>(v: T) -> T.

@jyn514
Copy link
Member

jyn514 commented Apr 26, 2023

petrochenkov thanks for explaining why you are against using #[rustc_box] here. You scottmcm, as well as the precedent of move_val_init convinced me that there should be no builtin#ptr_write, but instead it should be done via intrinsics. In fact, it would be best to have #[rustc_box] as intrinsic too, but there is no module in alloc for allocation related intrinsics, only one in core, which obviously can't touch allocations.

Does that mean we should close this issue? I'm not sure why this should be an intrinsic - rustc_box doesn't have function-like semantics, the whole point is that Box::new has different semantics than rustc_box.

@jyn514 jyn514 added A-allocators Area: Custom and system allocators T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 26, 2023
@est31
Copy link
Member Author

est31 commented Apr 26, 2023

petrochenkov thanks for explaining why you are against using #[rustc_box] here.

Btw with this, I meant builtin#ptr_write obviously, I mistyped.

Does that mean we should close this issue

I would like to keep it open to discuss the bringing back move_val_init intrinsic and using it instead of #[rustc_box], even though the suggestion from the issue title has been decided against. If you want I can file a new issue/MCP though, so feel free to close.

@jyn514 jyn514 changed the title Moving #[rustc_box] to builtin# Moving #[rustc_box] to move_val_init intrinsic Apr 26, 2023
@est31
Copy link
Member Author

est31 commented Apr 26, 2023

Okay @jyn514 I have updated the issue description with my new proposal. The next step would be maybe to get input from @RalfJung if they think that bringing back move_val_init in favour of getting rid of both #[rustc_box] and ShallowInitBox is a good idea, and if feedback is positive, to make a PR to put it into action. Boxes will always need special casing, but at least one wouldn't need ShallowInitBox any more, which would reduce the amount of special casing that boxes enjoy (at least that's my impression).

@nbdd0121
Copy link
Contributor

I still don't fully understand why ShallowInitBox is needed for moving out of boxes, because there are ways to create boxes without #[rustc_box] being involved at all, say Box::from_raw/from_raw_in or the safe Box::new_in. Do you say if you move out of an initialized box in those instances, UB is involved?

I didn't mean ShallowInitBox is needed for moving out of boxes. I mean that since we support moving out of boxes, it doesn't require a lot of effort to support ShallowInitBox.

There is probably just 20 lines of code needed to support shallow initialized box.

@nbdd0121
Copy link
Contributor

I personally think we should keep shallow initialization of boxes for a good reason: it (in theory), would allow us to initialize a box piece by piece, if we can gain the ability to the same thing for local variables. E.g. this can be helpful:

struct Foo {
    x: String,
    y: String,
}

fn init_foo() -> Box<Foo> {
    let mut b = /* create a shallow initialized box */;
    b.x = "a".to_owned();
    b.y = "b".to_owned();
    b
}

Of course this doesn't work today (for boxes and local variables), but I hope that it eventually would.

There is no way to express something like this if we get rid of shallow init box and replace everything with move_val_init.

@est31
Copy link
Member Author

est31 commented Apr 26, 2023

I think one can absolutely keep around ShallowInitBox, if it's seen as valuable beyond being used inside #[rustc_box]. Right now #[rustc_box], is one additional reason to have ShallowInitBox because adding the equivalent Box<MaybeUninit<[...]>> code in the mir building code is more involved than using + supporting ShallowInitBox.

There is no way to express something like this

You can, by using MaybeUninit, offset_of!, ptr::write, and assume_init (and some manual tracking what to drop). ShallowInitBox looks like a safe version of std::mem::uninitialized to me. At the very least, you'd make the safety checker know that there is no move, read, or function call happening on the uninitialized local b, at least before every field is initialized. You'd have to do analysis on the control flow graph to make sure that every path inititializes all fields, and on every place this happens, one would have to make the "proper" drop function to be called during unwinding instead of dropping the members only.

You'd also have to make sure optimizers don't do any reordering. At this point, even if you only want to support boxes, you'd have a big feature. And also, ideally you'd not just support boxes but also Rc, Arc, and all other types really. It's an interesting topic to discuss, and one should definitely give it a chance, but it's also quite complex and probably requires a lot of design work before one can unleash it on people.

I think if ShallowInitBox is not used by vec![] any more, it makes it easier to do experimentation with it, which can help make an RFC about safe partial initialization of boxes (or of anything whether it's boxes or not).

@RalfJung
Copy link
Member

RalfJung commented May 1, 2023

The next step would be maybe to get input from @RalfJung if they think that bringing back move_val_init in favour of getting rid of both #[rustc_box] and ShallowInitBox is a good idea

#111010 already brought back move_val_init, I think. Is there any question left then?

Having any kind of guarantee that no copy happens is a totally different question though, and would need some serious t-opsem discussion. However for *ptr = val this question doesn't even apply, it is more a question for *ptr = f(), right? How to allow copy elision for function calls is a long-standing question (#71117); fully guaranteeing copy elision is a whole different game. However I would think that in most cases, a hard semantic guarantee is not necessary; it can be a "quality of implementation" issue similar to e.g. hint::black_box. For the cases where a guarantee is needed because address stability matters, some Pin-based solution sounds more robust. I admit I know nothing about what it takes to actually pull that off though. Before any t-opsem discussion, someone would have to produce a good high-level writeup of what problem is being solved and what approaches these macros people mention are currently taking, and what a language-level solution could look like. But I don't know if that is even the goal of this PR.

@est31
Copy link
Member Author

est31 commented May 1, 2023

#111010 already brought back move_val_init, I think. Is there any question left then?

I guess no 😆 . Given that, we can switch vec to use write_via_move then.

@clubby789 you indicated interest higher in the thread, do you want to make a PR? You can also remove #[rustc_box] I guess. I wouldn't remove ShallowInitBox though just yet as @nbdd0121 wants to keep it for future features.

Also thanks for your feedback @RalfJung on the general placement new feature. Regarding guarantee vs no guarantee, I think I agree that different groups of users have different needs. Some might only see it as optimization, and they are served really well by LLVM already, they just want something that's a little bit better. Others are in the situation that if their object lands on the stack, they get a segfault, so ideally there would be a compile error for them if there is a copy happening. I'm not sure how to unify those two needs nicely, or if that should even be a goal. I wold love to be involved in efforts around the design however.

There has also recently been a thread on reddit about the issue, linking to the repo that @nbdd0121 mentioned higher up in the thread.

@RalfJung
Copy link
Member

RalfJung commented May 2, 2023

FWIW ptr.write is just a wrapper around move_val_init/write_via_move, so in many cases using the stable fn should suffice. I guess in some cases inlining happens too late and there is a benefit from calling the intrinsic directly?

@clubby789
Copy link
Contributor

Testing it, using both core::ptr::write and write_via_move seem to have the same behaviour as the current vec![<huge object>], crashing in debug mode due to a large stack allocation and writing directly to the heap in release

                let mut bx = $crate::boxed::Box::new_uninit();
                ::core::intrinsics::write_via_move(bx.as_mut_ptr(), [$($x),+]);
                bx.assume_init()

I'm also not sure how we could use the intrinsic/ptr::write without the user's argument being evaluated inside of the unsafe block

@est31
Copy link
Member Author

est31 commented May 2, 2023

I guess in some cases inlining happens too late and there is a benefit from calling the intrinsic directly?

Either too late or not at all, I don't know. There are definitely some heavily negatively affected benchmarks in the perf testsuite. The last attempt to use a function can be found here, where they tried to use Box::new() whose definition has an #[inline(always)], but that seems to have not been enough. Not 100% sure any more if the problem was around the data the compiler has to process or about the compiler's own performance being affected as it uses vec![] internally.


Testing write_via_move myself, LLVM seems to unify the function with the MaybeUninit::write variant, which also has sub-optimal codegen (or idk maybe it's good but it seems to have a lot of instructions): https://godbolt.org/z/WPbeGzz4h . I added an extra parameter so that this unification is not happening, but unification happens iff the functions have the same parameters and the same body. This means it might not be that helpful to use it... hmmm.

I'm also not sure how we could use the intrinsic/ptr::write without the user's argument being evaluated inside of the unsafe block

That's actually an interesting question: how do you remove some expression argument's ability to use unsafe... I need to make a more thorough search but initially I can't find any construct that can undo unsafe (without it being a closure or such which we obviously want to avoid).

@scottmcm
Copy link
Member

scottmcm commented May 2, 2023

how do you remove some expression argument's ability to use unsafe

One possible hack: control flow tricks.

if false {
    let _a = $expr;
} else {
    unsafe { 
        write_via_move(p, $expr);
    }
}

(Disclaimer: I didn't try it out. Would definitely work with THIR unsafeck, but it's possible that MIR unsafeck is too smart.)

@RalfJung
Copy link
Member

RalfJung commented May 2, 2023

safe blocks have been suggested for this purpose already around or even before 1.0, but never went anywhere.

@est31
Copy link
Member Author

est31 commented May 20, 2023

@scottmcm that's a nice trick!

Given the issue with the codegen of the write_via_move alternative and because builtin # box was rejected, I'll close this issue.

@safinaskar
Copy link
Contributor

safinaskar commented May 29, 2023

@est31

The other thing to consider is the type_ascribe! macro. I guess it would be best to not migrate that one either to builtin#, but to instead turn it into a proper intrinsic of the signature fn ascribe<T>(v: T) -> T.

As well as I understand, this will not work. As well as I understand, the following is true about type ascription:

  • It work not only for value expressions, but also for place expressions
  • It disallows coercions. Only subtyping allowed

As well as I understand creating normal intristic instead of magic builtin# will lose both properties, and thus such ascribe will become simply equivalent to already-existing fn identity<T>(v: T) -> T

@safinaskar
Copy link
Contributor

@scottmcm

One possible hack: control flow tricks

But what if macro (such as vec!) is called inside unsafe block? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-allocators Area: Custom and system allocators T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants