Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add freeze intrinsic and related library functions #3605

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

chorman0773
Copy link

@chorman0773

This comment was marked as resolved.

@saethlin saethlin added T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. T-opsem Relevant to the operational semantics team, which will review and decide on the RFC. labels Apr 2, 2024
@RustyYato
Copy link

RustyYato commented Apr 2, 2024

MaybeUninit::freeze should be unsafe.
Otherwise you could get MaybeUninit::<AnyType>::uninit().freeze() which is trivially unsound.

Thinking about this so late was a bad idea, sorry for the noise. I misread Self as T

@kennytm
Copy link
Member

kennytm commented Apr 2, 2024

MaybeUninit<T>::freeze() returns a MaybeUninit<T> not a T, why would that be unsound

@chorman0773
Copy link
Author

I also do explicitly address an unsafe T returning MaybeUninit::freeze in the rationale and alternatives section and explain why I went for the safe version instead.

@clarfonthey
Copy link
Contributor

clarfonthey commented Apr 2, 2024

Small thing to point out: although it's not exposed publicly, the compiler uses the term "freeze" to refer to things without interior mutability, and that makes the naming of this potentially confusing for those familiar with the term. That said, the existing "freeze" term could always be renamed to something else since it's not stable, but it's worth mentioning anyway.

One thing worth asking here is how this specifically differs from volatile reads, since I can see a lot of similarities between them, and while I understand the differences, a passing viewer might not, and that's worth elaborating a bit more.

Comment on lines +30 to +45
```rust
// in module `core::ptr`
pub unsafe fn read_freeze<T>(ptr: *const T) -> T;

impl<T> *const T{
pub unsafe fn read_freeze(self) -> T;
}
impl<T> *mut T{
pub unsafe fn read_freeze(self) -> T;
}

// in module `core::mem`
impl<T> MaybeUninit<T>{
pub fn freeze(self) -> Self;
}
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poking around the library, the obvious parallel to this functionality is zeroed, and these methods don't really parallel that at all.

Namely, zeroed says: give me a value that is initialized with zero bytes. You still have to assume_init to assert that zeroing is valid for your type, but you do know what the bytes are.

To me, freeze here says: initialize a memory location by "freezing" its bytes. Again, you should still have to assume_init to assert that an arbitrary value is valid for your type, but you know that the bytes are intialized.

Another maybe-controversial decision is that I think that these methods should require *mut T, since we're conceptually writing over uninitialized memory and replacing it with an arbitrary value, even though no actual writing is occurring (and thus, no data races can occur). It also allows for things like miri to actually randomize the data in the buffers to help with testing, without worrying about making things unsound.

Maybe there could be some sort of "atomic freeze" that accepts an Ordering, but in general, it seems like you're probably going to want a mutable pointer here.

So, to that end, it feels more like the API should look like this instead:

// in core::ptr
pub fn freeze<T>(ptr: *mut T);

pub unsafe fn read_freeze<T>(ptr: *mut T) -> T {
    freeze(ptr);
    read(ptr)
}

impl<T> *mut T {
    pub fn freeze<T>(self);

    pub unsafe fn read_freeze<T>(self) -> T {
        self.freeze();
        self.read()
    }
}

impl<T> MaybeUninit<T> {
    pub fn freeze(&mut self);

    // and maybe:
    pub fn frozen() -> MaybeUninit<T> {
        let mut uninit = MaybeUninit::uninit();
        uninit.freeze();
        uninit
    }
}

And maybe there could also be helpers for arrays, similar to how copy will also accept a number of elements. That API might mean that these freeze methods for pointers now accept a count indicating an array size.

Copy link
Contributor

@digama0 digama0 Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, freeze here says: initialize a memory location by "freezing" its bytes. Again, you should still have to assume_init to assert that an arbitrary value is valid for your type, but you know that the bytes are intialized.

I was confused about this as well, but reading the rest of the RFC I came to the conclusion that this is supposed to be a read-only operation which does nothing to the original memory location. It returns a "freezed version" of the read bytes but the original memory is left untouched.

The API you are talking about is summarized by MaybeUninit::freeze(&mut self) and is considered in the Alternatives section. (There are reasons not to want to go that route, at least in the first iteration of this API.)

Copy link
Contributor

@digama0 digama0 Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another maybe-controversial decision is that I think that these methods should require *mut T, since we're conceptually writing over uninitialized memory and replacing it with an arbitrary value, even though no actual writing is occurring (and thus, no data races can occur). It also allows for things like miri to actually randomize the data in the buffers to help with testing, without worrying about making things unsound.

No, this is actually literally a write to the target memory, with all of the consequences that follow from that. (No "conceptual" about it, there are cases where you need to actually issue a write instruction, fault on read-only pages, acquire exclusive access to the cache line, the whole nine yards.) The RFC proposes to use *ptr = ptr.read_freeze() as the way to express this operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, although I guess the main controversy here is that it kind of forgets the reason why MaybeUninit exists: in general, it's very bad to pass around values when you're dealing with potentially uninitialized memory.

For example, take this simple use case: you want to read out a struct in full, including padding bytes, which are normally uninitialized. So, let's say we freeze-read that struct, copy it over a buffer of bytes, then write that out.

This is unsound. When you read the struct as a value, the padding is still uninitialized. What you actually want to do is cast the struct to bytes first, then freeze-read that into your buffer.

But again, this is still wrong. You need to cast to a slice of maybe-uninit bytes, since the padding is uninitialized, freeze-read that into a slice of init bytes, and write that.

This is why fn zeroed<T>() -> T is mostly discouraged; if you're working with anything other than integers, it's generally wrong to use. This is why I think that fn read_freeze<T>(ptr: *mut T) -> T is the wrong API; in most cases, the T input and the T output should be different, but here, they're the same, and you have to do a bunch of weird transmutation before you actually get to the thing you want to do.

If we treat freeze as actually working on the memory location, then the freezing and reading can be done separately, with all the necessary transmutes in between. Or, we have a freeze_transmute_copy type thing that freezes and does a transmute_copy afterward.

But ultimately, I feel like freeze returning a value is the wrong choice.

Copy link
Contributor

@digama0 digama0 Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But again, this is still wrong. You need to cast to a slice of maybe-uninit bytes, since the padding is uninitialized, freeze-read that into a slice of init bytes, and write that.

I agree with the first part: if you read a struct T using read_freeze::<T>(_) -> T then you will still not have any padding bytes. But read_freeze::<[u8; size_of::<T>()]>() should work and produce initialized bytes, including for the padding, because (according to the docs in the RFC) the freezing happens before the typed-copy, so it should be safe to do as long as the target type is valid for every initialized bit pattern.

Using by-value instead of by-mut-ref freeze avoids a whole bunch of issues related to "what if we just elided the read?" on weird memory. (This has been discussed at length on #t-opsem.) It also makes it unusable on read-only memory. The API footgun concern is sound, but the by-value MaybeUninit<T> method should alleviate that concern, no?

Copy link
Contributor

@newpavlov newpavlov Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to be thinking of this is terms of permitted optimizations, but that's not a good way to specify a language

Yes, I know. But since I am more of a practitioner, it's easier for me to discuss concrete examples based on potential usecases I have in mind for freeze. Hopefully, then people like you who specialize in language semantics will be able to find a way to bridge what people want and how to specify it robustly.

Does not matter how robust your model is, if it does something surprising or unexpected for people who practice the language (be it it terms of performance or generated insructions), arguably, it's a bad model. You either need to better communicate what the model does, or change the model itself to better satisfy user expectations.

After this discussion, I personally find that the proposed freeze model is hardly useful and that RFC did not do a good job of communicating capabilities of the model. freeze is usually viewed as an optimization technique and the proposed approach may result in very surprising (for low-level programmers) code generation. The RFC needs at the very least to clearly discuss several examples of what can be and can not be done with the model and provide expected code generation for each example. In the current form it does not even mention interaction with MADV_FREE like at all!

One last example from me:

pub fn f() {
    unsafe {
        let mut buf: [u8; 1024] = MaybeUninit::<[u8; 1024]>::uninit().freeze().assume_init();
        extern_read(&mut buf);
    }
}

How much stack space this function would use when compiled with enabled optimizations? ~1024 bytes? ~2048 bytes? More? What exact writes would be generated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much stack space this function would use when compiled with enabled optimizations? ~1024 bytes? ~2048 bytes? More? What exact writes would be generated?

That example should be trivially optimizable to use 1kb of stack. No writes should be necessary since 1kb is less than a page size. If it was more than a page size then it's my understanding that stack probes would already be injected to probe the additional stack pages, irrespective of the use of freeze here.

Copy link
Contributor

@newpavlov newpavlov Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't be so sure. Such buffer can easily cross page boundaries. And even if it does not, you don't know if any writes were generated for this page. What if buf was allocated right at the page edge? You may argue that for stack pages MADV_FREE is not used, but: 1) I can easily imagine a threading library reusing stack frames with MADV_FREE 2) changing freeze behavior depending on whether it's used on stack or heap memory would be... not ideal, to say the least.

So instead of a trivial no-op expected by programmers for this code snippet, we get quite non-trivial behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So instead of trivial no-op expected by programmers for this code snippet, we get quite non-trivial behavior.

Yeah that's what happens when programmers are wrong. They expect things to be simple that aren't simple (see pointer provenance as another example of this). I don't blame them; optimizing compilers and system tricks like MADV_FREE make things complicated in subtle hard-to-predict ways. freeze needs careful documentation to make sure programmers are aware of the caveats.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if buf was allocated right at the page edge?

Then you may need additional probes at the page boundary if the contract between the compiler and the platform does not require that stack reads give deterministic results already. I don't know if such a contract is already defined, but a reasonable option is to simply say that such a threading library is just wrong (in the same way the thread stacks are required to be properly aligned.)

changing freeze behavior depending on whether it's used on stack or heap memory would be... not ideal, to say the least.

The behaviour would be exactly the same, because behaviour is a property of the AM. What instructions are emitted by the compiler for the intrinsic could be completely different in different circumstances though... Obviously? That's what compiler optimizations do!

So instead of a trivial no-op expected by programmers for this code snippet, we get quite non-trivial behavior.

The behaviour is trivial, the emitted code is not. Freeze is not a no-op in the AM - whether it's a no-op in practice depends on optimizations and contracts, in the same way that whether a copy is elided depends on optimisations. If the contract says that stack memory reads are deterministic then it can indeed compile to a no-op. If not then of course the compiler needs to emit a write to that memory to ensure that it becomes initialized.

Comment on lines +83 to +87
`read_volatile` and `read_freeze` are unrelated operations (except insofar as they both `read` from a memory location).
`read_volatile` performs an observable side effect (that compilers aren't allowed to remove), but will otherwise act (mostly) the same as `read`. `read_volatile` does not freeze bytes read.
In contrast, `read_freeze` is not a side effect (thus can be freely optimized by the compiler to anything equivalent).

It is possible in the future that `read_volatile` may carry a guarantee of freezing (non-padding) bytes, but this RFC does not provide that guarantee.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, part of the reason that I mentioned read_volatile is that in many senses, it does freeze the bytes read.

The purpose of read_volatile is to prevent the compiler from assuming that the memory location does not change outside the program, e.g. in the case of things like drivers for hardware that uses direct memory access. However, it also assumes implicitly that the value itself is frozen, since it's not able to read an uninitialized value from that location; it wouldn't be useful if it did that, since that means that any read from that location could potentially be UB based upon the decisions made by some external entity, which doesn't make a lot of sense.

Freezing is still useful for reasons like you said-- volatile reads have side effects that are undesired in most cases where freezing is useful. However, I don't think it's right to say that volatile reads don't freeze the data they read. In fact, I would argue that volatile reads always freeze the data they read.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it currently is, this is not the case. You can observe the difference with miri or msan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, so, what exactly does that mean for embedded contexts and the DMA case? It's just always been UB to volatile read from these addresses, and only adding freeze makes it not UB?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably anything outside the AM that's accessing memory outside the AM isn't setting things to uninit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CE link shows that llvm also doesn't think that read_volatile freezes anything: https://godbolt.org/z/6Td6Mhd67

Note that it's discarding the value obtained via read_volatile and not writing it back to x. If it froze the result, then that would be an invalid transformation, because code could observe that the value was frozen (see my note about MADV_FREE making this observable even at the machine level).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I completely misread MaybeUninit::<u8>::uninit().as_ptr() as MaybeUninit::<*const u8>::uninit().assume_init() for some reason, which is why I was confused. The latter would be UB just by itself.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Updated link, also showing that the store is tossed at the llvm level, which means it doesn't think that the load volatile/store pair as being able to change the pointed-to value: https://godbolt.org/z/9Tecb6r7r)

Looks like GCC does the same thing, although ICC and MSVC don't.

I find that very unfortunate.

What makes volatile volatile is that it pierces through Abstract Machine semantics and is defined in terms of assembly-level semantics instead. It has to be defined that way; otherwise it wouldn't make sense to say things like "can't be optimized" or "must be a single load/store instruction". And assembly-level semantics don't have uninitialized bytes.

I'm not saying it's impossible to define volatile in a way where it actually can have uninitialized bytes, justifying the behavior observed here. But it makes the definition more complicated for little benefit. My ideal definition for volatile is essentially "equivalent to an asm! block containing a load/store instruction". There are a few caveats, maybe. The compiler is allowed to reorder volatile accesses with non-volatile ones. The compiler can assume a volatile load/store will actually be executed, unlike an asm! block which can be patched at runtime, although I'm not sure that actually justifies any new optimizations. Regardless, there should only be a few caveats. It should be basically equivalent to an asm! block.

I am going to file a bug report against LLVM to see what they think about this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've tried in the past to make LLVM consider volatile loads as having an implicit freeze, and they wouldn't have it.

But we could codegen read_volatile with a freeze of the loaded value...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like GCC does the same thing, although ICC and MSVC don't.

Hang on a second. ICC and MSVC do both the read and the write, and Clang tosses the write but GCC tosses the read as well. That's worse than "the same thing as Clang", it's a straight up violation of the popular understanding of volatile in C. You should also be filing a bug report against GCC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(FTR, xlang from lccc does spec that both loads and stores with volatile in the access-class do the same thing as freezeing every read/written byte - given gcc's behaviour appears to just be a bug, it seems like we could say Rust volatile loads/stores freeze but don't particularily want to extend the scope of the RFC to modify read_volatile/write_volatile)

`read_volatile` performs an observable side effect (that compilers aren't allowed to remove), but will otherwise act (mostly) the same as `read`. `read_volatile` does not freeze bytes read.
In contrast, `read_freeze` is not a side effect (thus can be freely optimized by the compiler to anything equivalent).

It is possible in the future that `read_volatile` may carry a guarantee of freezing (non-padding) bytes, but this RFC does not provide that guarantee.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it could be mentioned that adding "freezing" semantics to read_volatile is not necessary to semantically perform a "freezing volatile read"; one can read_volatile as MaybeUninit and then by-value freeze that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence alone made me realise just how much I had been misunderstanding the RFC, so, I think it's worth including here at least.

* Either one of the two functions could be provided on their own
* Both functions are provided for maximum flexibility, and can be defined in terms of each other. The author does not believe there is significant drawback to providing both functions instead of just one
* An in-place, mutable `freeze` could be offered, e.g. `MaybeUninit::freeze(&mut self)`
* While this function would seem to be a simple body that llvm could replace with a runtime no-op, in reality it is possible for [virtual memory](https://man7.org/linux/man-pages/man2/madvise.2.html#MADV_FREE) that has been freshly allocated and has not been written to exhibit properties of uninitialized memory (rather than simply being an abstract phenomenon the compiler tracks that disappears at runtime). Thus, such an operation would require a potentially expensive in-place copy. Until such time as an optimized version is available, we should avoid defining the in-place version, and require users to spell it explicitly as `*self = core::mem::replace(&mut self, uninit()).freeze()`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add this, the in-place version also can be implemented in a library crate, that uses inline asm to touch/write to each page at least once ensuring that it is not MADV_FREE.

Without the primitive freeze, there's no corresponding sequence of AM operations that this is equivalent to, but with it, the asm is equivalent to freezing each byte by-hand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate that a bit @thomcc? Like, what inline assembly are you thinking of? Because from your comment it feels like you're proposing both the use of ASM and the new operation.

Copy link

@zachs18 zachs18 Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clarfonthey Inline assembly that does this can be written today, but because the freeze operation does not exist in the abstract machine, that inline asm has... maybe not quite UB, but it performs something the AM does not recognize as a thing that can be done. However, if the freeze operation is added to the AM, that inline asm can be shown to have fully defined behavior in the AM in terms of the freeze operation.

In other words, yes, the ASM and the new stdlib functions would probably both be used (for different use-cases (freezing a large chunk of memory in-place vs freeze-loading a single value, for simple examples)), but semantically both would be defined in terms of the new AM operation.

x86_64 ASM example (Not the most optimized, but gets the point across. Provided AS-IS with no warranty, etc, etc.)
#[inline(never)]
pub fn freeze_in_place(mem: &mut [MaybeUninit<u8>]) -> &mut [u8] {
    let len: usize = mem.len();
    if len > 0 {
        let ptr: *mut MaybeUninit<u8> = mem.as_mut_ptr();
        // touch the first byte of the slice, then
        // touch the first byte in each subsequent page that 
        // contains a byte of the slice.
        unsafe {
            core::arch::asm!(
                "2:",
                "mov {tmp:l}, BYTE PTR [{ptr}]",
                "mov BYTE PTR [{ptr}], {tmp:l}",
                "mov {tmpptr}, {ptr}",
                "and {tmpptr}, -4096", // addr of first byte of this page
                "add {tmpptr}, 4096", // addr of first byte of next page
                "mov {tmplen}, {tmpptr}",
                "sub {tmplen}, {ptr}",
                // tmplen is the number of bytes we semantically froze 
                // with the above `mov BYTE PTR ...`.
                // if this is >= the remaining length of the slice, we're done
                "cmp {tmplen}, {len}",
                "jae 3f",
                // otherwise, keep going
                "sub {len}, {tmplen}",
                "add {ptr}, {tmplen}", 
                "jmp 2b",
                "3:",
                ptr = inout(reg) ptr => _,
                len = inout(reg) len => _,
                tmpptr = out(reg) _,
                tmplen = out(reg) _,
                tmp = out(reg) _,
            );
        }
    }
    // Safety: either the slice is empty, or we touched every page containing the slice
    unsafe {
        &mut *(mem as *mut [_] as *mut [u8])
    }
}

@RalfJung
Copy link
Member

RalfJung commented Apr 4, 2024

Cc @rust-lang/opsem

Comment on lines +30 to +45
```rust
// in module `core::ptr`
pub unsafe fn read_freeze<T>(ptr: *const T) -> T;

impl<T> *const T{
pub unsafe fn read_freeze(self) -> T;
}
impl<T> *mut T{
pub unsafe fn read_freeze(self) -> T;
}

// in module `core::mem`
impl<T> MaybeUninit<T>{
pub fn freeze(self) -> Self;
}
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there's a "nice" API here that isn't project-safe-transmute

Can you say a bit more about this? I'm the head of PST, and would love gather thoughts of how freeze might impact ST APIs.

Would it make any sense for the intrinsic to have this signature:

pub fn freeze<T>(v: &T) -> [u8; size_of::<T>()];

I think this would avoid some of the safety footguns of the signature defined by the RFC, but I'm not sure if this limits its utility in any obvious (or non obvious) ways.


```rust
// in module `core::ptr`
pub unsafe fn read_freeze<T>(ptr: *const T) -> T;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the safety contract of this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I write it later in the guide-level explanation: Same as core::ptr::read with the caveat of freezing uninit bytes (thus having a weaker initialization invariant on *ptr).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, to me it felt obvious the unsafe here was just because it's a pointer deref, but since the ultimate implementation will have document that anyway, probably worth just writing it here explicitly.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

* Which of the library functions should recieve the direct language intrinsic, between `ptr::read_freeze` and `MaybeUninit::freeze`
Copy link
Member

@jswrenn jswrenn Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A third option is to simply stabilize the intrinsic in-situ, like core::intrinsics::copy.

Copy link
Member

@RalfJung RalfJung Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was accidental (only core::ptr::copy was supposed to be stable but one is a re-export of the other and rustc used to not be able to distinguish re-exports for stability). I don't think we should repeat this mistake. The entire core::intrinsics module is meant to be internal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I don't want any more directly exposed intrinsics.

text/0000-freeze-uninit.md Outdated Show resolved Hide resolved
* Either one of the two functions could be provided on their own
* Both functions are provided for maximum flexibility, and can be defined in terms of each other. The author does not believe there is significant drawback to providing both functions instead of just one
* An in-place, mutable `freeze` could be offered, e.g. `MaybeUninit::freeze(&mut self)`
* While this function would seem to be a simple body that llvm could replace with a runtime no-op, in reality it is possible for [virtual memory](https://man7.org/linux/man-pages/man2/madvise.2.html#MADV_FREE) that has been freshly allocated and has not been written to exhibit properties of uninitialized memory (rather than simply being an abstract phenomenon the compiler tracks that disappears at runtime). Thus, such an operation would require a potentially expensive in-place copy. Until such time as an optimized version is available, we should avoid defining the in-place version, and require users to spell it explicitly as `*self = core::mem::replace(&mut self, uninit()).freeze()`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I totally understand the performance considerations here. What is an "in place" copy? Why is it more expensive than a move?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not more expensive than a move (except as to llvm having more freedom to elide it), but I'm guessing a move looks more like it's "Doing something", whereas a user might assume that MaybeUninit::freeze(&mut self) is really a runtime no-op, and complain it's O(n) on their massive structure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just rewording how I interpreted this, to verify I'm understanding correctly: since virtual memory may be uninitialised but still copy-on-write (for example, from a forked process), there are cases where a copy would still happen even if it's not explicit in the in-place signature, and so it's probably easier to just treat as a read anyway?

This does feel like a case of, optimising for the common case even if it's not always the best, and it might be worth elaborating more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, I personally think that "user isn't capable of reading documentation" is a compelling case for advocating for a particular API. slice::reverse is also an in-place mut method but users don't have issues understanding that it takes linear time. I know there's a better justification, so, I'd personally avoid that kind of argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting on my Project Safe Transmute hat, I'd like to advocate for alternatively or additionally providing an in-place intrinsic. Libraries like zerocopy and bytemuck almost exclusively work with references to values, and those values are potentially DST. The proposed by-value intrinsic does not lend itself to working on DSTs. At the very least, I'd like this RFC to propose a snippet that used the by-value intrinsic to implement a by-ref routine that works on slice DSTs.

That said, the safety considerations of implementing a general, in-place freeze are quite subtle. Zerocopy, for instance, requires that every unsafe block is convincingly proven to be sound with citations to authoritative documentation. I think we're very far from having an abstract machine for Rust that is sufficiently well specified that zerocopy could robustly justify the operation described in #3605 (comment). I would far prefer this fundamental operation to be provided by the standard library, than for it to be re-invented ad hoc by crates.

Co-authored-by: Jack Wrenn <me@jswrenn.com>
impl<T> *const T{
pub unsafe fn read_freeze(self) -> T;
}
impl<T> *mut T{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a method for NonNull too.

* Undefined behaviour does not prevent malicious code from accessing any memory it physically can.


# Rationale and alternatives
Copy link
Member

@jswrenn jswrenn Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative that is being considered by PST (see discussion here) is something like:

/// Initializes possibly-uninitialized bytes in `v` to `0`.
///
/// # Safety
///
/// Unsafe code may depend on uninitialized bytes in `v` being
/// initialized to `0`. Note, however, that the initialization
/// of padding bytes is not preserved on move.
// TODO: Restrict `v` to non-dyn pointers.
fn initialize<T: ?Sized>(v: &mut T) {}

This alternative mitigates some of the listed Drawbacks of this RFC's proposal: Rust would retain the property that sound code does not read uninitialized memory.

The performance cost of this is linear with respect to the number of padding bytes in a type. Since this is less than or equal to the total number of bytes in a type, it may be more performant than a move. For types with no padding, it's free.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To implement initialize as-written (i.e. specifically initializing padding bytes to zero, not just some arbitrary value, and not modifying non-padding bytes), this would require compiler support to know where padding bytes are in an arbitrary type, and as the comment mentions it definitely wouldn't work for dyn Trait in general.

If would also that require compiler support to guarantee "better-than-O(size_of_val)" performance, even for arbitrary initialization instead of zero-initialization.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work with dyn Trait (with the ?Sized bound there - though that is mentioned), nor on types that allow uninit bytes that aren't in padding bytes. This does not function for the MaybeUinit<Scalar> case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zachs18:

To implement initialize as-written (i.e. specifically initializing padding bytes to zero, not just some arbitrary value, and not modifying non-padding bytes), this would require compiler support to know where padding bytes are in an arbitrary type, and as the comment mentions it definitely wouldn't work for dyn Trait in general.

The compiler does know where padding bytes are for arbitrary types. That's the entire shtick of Project Safe Transmute, and it's also part of how miri is able to validate transmutations at runtime.

If would also that require compiler support to guarantee "better-than-O(size_of_val)" performance, even for arbitrary initialization instead of zero-initialization.

I don't follow why that would be the case. Both initialize and freeze are O(size_of_val).


@chorman0773:

This doesn't work with dyn Trait (with the ?Sized bound there - though that is mentioned)

As I understand it, neither initialize nor freeze would work on such types. At any rate, I haven't seen any demand from folks doing zero-copy parsing to support dyn Trait. (If anyone reading this has a use-case for that, please let us know!)

nor on types that allow uninit bytes that aren't in padding bytes.

As documented, it would overwrite them. It would probably be useful to provide initialize_padding that only overwrote padding bytes, too.

This does not function for the MaybeUinit<Scalar> case.

Can you say more about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler does know where padding bytes are for arbitrary types. That's the entire shtick of Project Safe Transmute, and it's also part of how miri is able to validate transmutations at runtime.

There's a catch here: for enums, we need to read the discriminant to find out where the padding bytes are. So this can only be done for data where the discriminant is valid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// Unsafe code may depend on uninitialized bytes in `v` being
/// initialized to `0`.

I think that's unimplementable because sometimes LLVM doesn't know if bytes are uninitialized (e.g. calling an external function), so it can't know which bytes to zero. freeze without zeroing works because when LLVM doesn't know if a byte is uninitialized, it just uses the current machine-level value of that byte, which is always implementable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As documented, it would overwrite them. It would probably be useful to provide initialize_padding that only overwrote padding bytes, too.

In order for this to work, either the compiler would have to know and preserve which bytes are uninit (violating the refinement that an uninitialized byte to any initialized value which is not implementable), or it would have to set every byte to 0.

Copy link
Member

@jswrenn jswrenn Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@programmerjake, @chorman0773 The semantics of initialize and freeze are different. The compiler does, in fact, know whether bytes in a type are possibly uninit, because the compiler knows where it has placed padding and union fields. However, the compiler cannot know, statically, whether union-introduced uninit bytes have been overwritten at runtime, and so initialize would either need to unconditionally overwrite union-introduced uninit bytes (initialize), or leave them untouched (initialize_padding).

freeze has the advantage that it preserves the runtime values of union-introduced uninit bytes. However, for the sort of zero-copy (de)serialization done by consumers of bytemuck and zerocopy, this property isn't necessarily important (it's unusual that you'd cast initialized bytes into a MaybeUninit). freeze also has the advantage that it can operate on invalid data.

I don't claim that initialize covers all use-cases of freeze, but I do claim that it satisfies many of the use-cases relevant to Project Safe Transmute, and it does so without some of the drawbacks of freeze:

It's a sufficiently viable and compelling alternative that it deserves some consideration in the Alternatives section of this RFC.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialize isn't really an alternative IMO so much as it is just a different operation. As far as I can tell, initalize is for when you have &mut T (with already valid T), want &mut [u8] preserving the validity of T, are okay with branching code for initializing through enum, and are okay overwriting any and all union bytes.

Without an additional bound that union isn't used, I don't see this being particularly useful, even for safe transmute, although PST is also likely the most likely to be able to enforce that bound. With the bound it's useful for getting to &[u8] which can then be cast further, but in the abstract I can't see any usage that wants this operation not exclusively as a way to bypass using difficult to express "uninit at the same (or more) byte offsets" bounds.

There's no need for initialize to accept ?Sized, either. Just have initialize for sized types and a separate initialize_slice for slices. I suppose that doesn't work for custom slice tail DST, but that's also status quo for most API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll agree that initialize is a completely different operation. It's probably worth at least tangentially mentioning in the Alternatives section, though. Initialize doesn't do a whole lot of good for use cases 2 and 3, though.


Examples of uses:
1. The major use for freeze is to read padding bytes of structs. This can be used for a [generic wrapper around standard atomic types](https://docs.rs/atomic/latest/atomic/struct.Atomic.html).
2. SIMD Code using masks can load a large value by freezing the bytes, doing lanewise arithmetic operations, then doing a masked store of the initialized elements. With additional primitives not specified here, this can allow for efficient partial load operations which prevent logical operations from going out of bounds (such a primitive could be defined to yield uninit for the lane, which could then be frozen).
Copy link
Member

@RalfJung RalfJung Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to rely on us being able to map LLVM poison to Rust "uninit", which is currently not possible.

Same for the next bullet point.

Only the bytes of the return value are frozen. The bytes behind the pointer argument are unmodified.
**The `read_freeze` operation does not freeze any padding bytes of `T` (if any are present), and those are set to uninit after the read as usual.**

`MaybeUninit::freeze` is a safe, by-value version of `read_freeze`. It takes in a `MaybeUninit` and yields an initialized value, which is either exactly `self` if it is already initialized, or some arbitrary value if it is not.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth noting that the term "initialized" here refers to the notion of an Abstract Byte in memory being initialized. This is not to be confused with the notion of MaybeUninit being "initialized" in the sense that assume_init() is safe to call. The text says that MaybeUninit::<bool>::uninit().freeze() is initialized but it is obviously not safe to call assume_init on this.

# Future possibilities
[future-possibilities]: #future-possibilities

* With the project-portable-simd, the `Simd` type could support `Simd::load_partial<T, const N: usize>(x: *const T) -> Simd<[MaybeUninit<T>;N]>` (signature to be bikeshed) which could then be frozen lanewise into `Simd<[T;N]>`. With proper design (which is not the subject of this RFC), this could allow optimized loads at allocation boundaries by allowing operations that may physically perform an out-of-bounds read, but instead logically returns uninit for the out-of-bounds portion. This can be used to write an optimized implementation of `memchr`, `strchr`, or `strlen`, or even optimize `UTF-8` encoding and processing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relies on a hypothetical new LLVM operation that does a load where out-of-bounds memory is undef, right? That should be mentioned.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or an appropriate masked/partial load that doesn't logically perform the problematic reads at the llvm level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text does not indicate a way for portable-simd to have the information to produce such a mask.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use an expository signature of

Simd::read_select(
    source: *const [T],
    enable: Mask<isize, N>,
) -> Simd<MaybeUninit<T>, N>

This is deliberately imitating the signature of gather_select and gather_select_ptr. While likely not the "correct" signature for a partial read, it can clearly do what the example usage wants to do.

Copy link
Member

@RalfJung RalfJung Apr 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load_select is likely closer to the intended primitive.

text/0000-freeze-uninit.md Outdated Show resolved Hide resolved
# Future possibilities
[future-possibilities]: #future-possibilities

* With the project-portable-simd, the `Simd` type could support `Simd::load_partial<T, const N: usize>(x: *const T) -> Simd<[MaybeUninit<T>;N]>` (signature to be bikeshed) which could then be frozen lanewise into `Simd<[T;N]>`. With proper design (which is not the subject of this RFC), this could allow optimized loads at allocation boundaries by allowing operations that may physically perform an out-of-bounds read, but instead logically returns uninit for the out-of-bounds portion. This can be used to write an optimized implementation of `memchr`, `strchr`, or `strlen`, or even optimize `UTF-8` encoding and processing.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use an expository signature of

Simd::read_select(
    source: *const [T],
    enable: Mask<isize, N>,
) -> Simd<MaybeUninit<T>, N>

This is deliberately imitating the signature of gather_select and gather_select_ptr. While likely not the "correct" signature for a partial read, it can clearly do what the example usage wants to do.

@JarredAllen
Copy link

JarredAllen commented Apr 6, 2024

An extra function related to this that I've often found myself wishing existed (name unimportant and can be changed):

impl<const N: usize> [u8; N] {
    pub fn new_freeze() -> Self {
        unsafe { MaybeUninit::<Self>::new().freeze().assume_init() }
    }
}

I mostly find this useful for functions on std::io::Read which require an initialized &mut [u8], for which there's no reason to write a value just for it to be overwritten.

It's not that important, since it's easy to implement on top of the functions provided here, but I think a safe function in the standard library for this case would be nice.

Co-authored-by: Ralf Jung <post@ralfj.de>
@chorman0773
Copy link
Author

I can agree with a documentation-level warning along the lines of "bytes obtained by freezing uninitialized memory can be previous contents from old allocations that happen to overlap, which may include cryptographic secrets. It is generally insecure to transmit frozen bytes to an untrusted 3rd party, such as over a network, unless steps are taken to prevent accidental or malicious exfiltration of secure data"

@ijackson
Copy link

bytes obtained by freezing uninitialized memory can be previous contents from old allocations that happen to overlap,

I think we have a real disagreement about the semantics here. Based on my understanding of the (non)-guarantees offered by freeze, I think the above text radically understates the problem.

My understanding of the semantics is, as @RalfJung puts it:

And as the saying goes -- when considering the security of programs that do non-deterministic operations, you should assume the result to be either random or your secret key, depending on what is worse.

So specifically, the bytes you get out of freeze are not limited to previous contents of now-released allocations (whether on the stack or the heap). They might be from anywhere, including arbitrary other parts of memory. So, indeed, they might be magically transported from memory being concurrently accessed by other threads. They might come from live mutable borrows that you couldn't deliberately access without UB. etc.

Spectre makes all this more likely and/or worse (in a way that's super confusing to reason about).

If my understanding of the semantics are wrong, then there are some guarantees. But what are they? The documentation needs to clearly state those guarantees, as positive statements, in a way that means that the compiler authors (including LLVM experts) can verify that what we're promising is true.

If my understanding is right, then these functions are extremely dangerous and that needs to be stated much more alarmingly in the docs. Giving mild examples of what might happen is very misleading, when the scope of possible behaviiours is so wide, and the worst cases so much worse than the examples.

@chorman0773
Copy link
Author

chorman0773 commented Apr 11, 2024

So specifically, the bytes you get out of freeze are not limited to previous contents of now-released allocations (whether on the stack or the heap). They might be from anywhere, including arbitrary other parts of memory. So, indeed, they might be magically transported from memory being concurrently accessed by other threads. They might come from live mutable borrows that you couldn't deliberately access without UB. etc.

They can come from any of these places, but I'd expect one of the following from any non-malicious compiler:

  • Some fixed constant after folding a freeze undef/freeze uninit (or zero-initing memory/registers),
  • Stale contents of the memory being read,
  • Stale contents of some register, after a read is elided,
  • Trap, because you read as a type that doesn't allow all initialized byte sequences (like bool) and the compiler helpfully chose a byte sequence the type doesn't allow.

I'd generally not expect a well-behaved compiler to yank memory from elsewhere in the program. While it theoretically could, I'm not sure there's a security reason to mention more than those cases that isn't caused by "Your compiler sucks and you should file a QoI issue to their repo".

It's up there with the stories we tell people about compilers formatting your hard drive on UB - they're within their rights to do so, and you shouldn't be writing UB-causing code, but if the compiler actually invented a call to Command::new("mkfs.ext4").arg("/dev/sda1").spawn(), I'd be within my rights to file an issue and never use that compiler again.

@programmerjake
Copy link
Member

programmerjake commented Apr 11, 2024

  • Trap, because you read as a type that doesn't allow all initialized byte sequences (like bool) and the compiler helpfully chose a byte sequence the type doesn't allow.

that one is just plain UB, so the compiler could reasonably delete all code that must reach that point in the program, just like unreachable_unchecked.

e.g.: https://llvm.godbolt.org/z/qMqcMbzr4

@chorman0773
Copy link
Author

Yeah, that is indeed just a misspelling of core::hint::unreachable_unchecked(), and has the same implications.

@oskgo
Copy link

oskgo commented Apr 12, 2024

It is possibly to use freeze to build a sound and safe API that can expose secrets. I think this may be an issue because programmers writing safe Rust today can assume this doesn't happen (barring bugs in unsafe code).

Currently the easiest way to avoid leaking any sensitive data is to not have access to it. With the addition of freeze they also have to make sure they're not using an API that outputs raw frozen data.

One could imagine an API for zero copy serialization that uses freeze to serialize padded data. This would be fine to use for local storage or transfer between nodes of equal trust, but could be disastrous if used to send data between computers with different privileges.

@Diggsey
Copy link
Contributor

Diggsey commented Apr 12, 2024

It is possibly to use freeze to build a sound and safe API that can expose secrets.

You would need to use unsafe internally to do that, but yes you could expose a safe API that exposes secrets. As already discussed though, you don't need freeze to be able to do that, so it's not really relevant: there are any number of ways to do that already, the simplest being to just do anything that's UB - since when UB occurs the compiler is allowed to do anything, including leaking secrets.

Rust is a systems programming language, that means (and rightly so...) that you can use unsafe code to do just about anything that is possible.

There is a related question which is worth answering though: should it be acceptable for a safe API to expose "frozen" data to consumers of that API, and the answer to that is probably "no", since reading uninitialized memory could be considered not memory safe. This is purely a documentation question though, it doesn't affect the API proposed here, since this API offers no way to safely get at the uninitialized data.

@RalfJung
Copy link
Member

RalfJung commented Apr 12, 2024

You would need to use unsafe internally to do that, but yes you could expose a safe API that exposes secrets. As already discussed though, you don't need freeze to be able to do that, so it's not really relevant: there are any number of ways to do that already, the simplest being to just do anything that's UB - since when UB occurs the compiler is allowed to do anything, including leaking secrets.

There is no existing way to do that without causing UB. So I don't think "there are any number of ways" is accurate. We have a very well-established norm that UB is not okay and we have tools to help us detect UB. So the risk that a library will leak contents of unrelated memory right now is fairly contained.

This RFC makes that risk a lot bigger by making it possible to write sound code that passes all sanitizers and yet still leaks memory contents. The social norm of "though shall not cause UB" is no longer sufficient to defend against these kinds of leaks.

So I agree that every function that leaks the result of a freeze should have a very clear warning that it does so, probably linking to some central docs in the standard library that explain the background. The key point here is to be actionable, so e.g. make it clear how that data should be treated. Even with the point added in drawbacks, this point does not get enough attention in the RFC. For instance, the guide-level explanation should give guidance on how to manage and contain the risk of data leakage.

However, I fail to see any connection to Spectre. With Spectre one can already have leaks via timing in existing entirely sound code; I don't see how things would become any worse with freeze, nor which actionable advice to give to reduce the risk. Spectre is a hardware bug and there's little that programming languages can do here; if there is some sort of Spectre mitigation that can be applied in Rust then that discussion seems largely unrelated to this RFC.

```

# Drawbacks
[drawbacks]: #drawbacks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another drawback that should be discussed is that this will lead to valgrind errors in sound Rust code. Ideally there is some plan for how to enable valgrind to still detect UB due to use of uninit memory (it would be bad to lose this capability) while accepting code that correctly uses freeze.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would imagine Rust can have a cfg that makes freeze ops be reported to valgrind, so valgrind can properly handle it once that feature is implemented in valgrind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure valgrind already has false positives in sound Rust code because it doesn't understand MaybeUninit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of that?

@digama0
Copy link
Contributor

digama0 commented Apr 12, 2024

You would need to use unsafe internally to do that, but yes you could expose a safe API that exposes secrets. As already discussed though, you don't need freeze to be able to do that, so it's not really relevant: there are any number of ways to do that already, the simplest being to just do anything that's UB - since when UB occurs the compiler is allowed to do anything, including leaking secrets.

There is no existing way to do that without causing UB. So I don't think "there are any number of ways" is accurate. We have a very well-established norm that UB is not okay and we have tools to help us detect UB. So the risk that a library will leak contents of unrelated memory right now is fairly contained.

There are certainly many ways to have a safe API that exposes secrets. You can just fail to sanitize your arguments appropriately, have a backdoor or malware, or loads of other things. unsafe is not a security boundary. Rust protects against memory unsafety with the unsafe keyword, but there are many ways safe code can be insecure without having UB or memory unsafety in them. Passing nondeterministic information (or just regular secrets) to untrusted parties is an example of insecure behavior, not unsafe behavior.

@oskgo
Copy link

oskgo commented Apr 12, 2024

@digama0 Not just any secrets, but secrets you don't know about and can't currently access with safe code.

Rust user should know that UB, or more generally unsoundness, is a security hazard no matter where it occurs. We have strong norms about this. Even though the safe/unsafe boundary is not a security boundary we expect more care when using unsafe and treat bugs in unsafe code as much more critical. There's even a language feature to help make the distinction. We could instead have gone with the Swift route and made unsafety only be mentioned in documentation and naming conventions.

Obviously there's a reason for this. UB is (at least in theory) at least as bad as calling into malware. In practice we care about this because UB can cause data corruption, allow remote code execution and give access to any secrets on the machine not protected by a higher privilege than the program.

With freeze we poke a hole in this. Now safe code has to worry about this as well, not just their own secrets, which if you structure your code well will often be none.

Consider the zero copy serialization library I mentioned earlier. In my view that's a perfectly legitimate use of freeze. We can hope that we'll have just as strong norms that frozen values should never be exposed, but I'm doubtful. If such a library is accepted and gets to expose a safe API then misuse of the API is very dangerous with nonlocal effects similar to UB, even if they document that well. Imagine using it on config stored locally at first and later adding a feature to share that config with others, for example.

@digama0
Copy link
Contributor

digama0 commented Apr 12, 2024

This is not the first use of nondeterminism in Rust by a long shot. The allocator is nondeterministic, so for all you know it is encoding all your secrets into the pointer values, and if you ever use {:p} printing you might be exfiltrating something without realizing. I don't think there is any reason to be any more alarmist about this nondeterminism than any other kind of nondeterminism in the opsem.

@RalfJung
Copy link
Member

RalfJung commented Apr 12, 2024

It is much more likely that secrets will be leaked by dumping the contents of uninit memory (via freeze) than via the addresses of the pointers returned by the allocator. Your claim that data could be exfiltrated by the allocator is rather far-fetched -- can you even construct an example where that actually could realistically happen?

We're not being "alarmist" but we are raising valid concerns that the RFC should address, and that are quantitatively if not qualitatively different from what exists in Rust today.

@RalfJung
Copy link
Member

This is not at all a new concern either, see e.g. #837. The new RFC should probably reference that old proposal and engage with the arguments raised there.

@oskgo
Copy link

oskgo commented Apr 12, 2024

It's already been mentioned that a non-malicious compiler should do one of four things when freezing unit data, and two of those include getting stale data.

I don't expect a non-malicious allocator and compiler pair to choose adresses based on secrets.

I would be much less worried if a secure version of this such as initialize existed already, and we could put in bold on top of the documentation for freeze something like: "99% of use cases should use initialize instead`, especially if you're possibly exposing frozen values in safe APIs"

@VitWW
Copy link

VitWW commented Apr 14, 2024

  1. Freeze function is effective tool to prevent double initializing (for example, for custom allocators).
  2. Freeze representation does not UB for types with "total" representation, when any value of a memory is a valid value of a given type.
    As I know, not all types are safe to have freeze representation.
    But some types definitely are "freezeable", like numbers, arrays of "freezeable" types.

@ryanavella
Copy link

You would need to use unsafe internally to do that, but yes you could expose a safe API that exposes secrets. As already discussed though, you don't need freeze to be able to do that, so it's not really relevant: there are any number of ways to do that already, the simplest being to just do anything that's UB - since when UB occurs the compiler is allowed to do anything, including leaking secrets.

There is no existing way to do that without causing UB.

I have an application which does this intentionally by separately compiling and linking asm. No lto of course.

I assumed this "hack" is sound because the reads happen entirely outside of the Rust abstract machine, i.e. in the assembly I'm abiding by an entirely different abstract machine's rules. Are you suggesting otherwise?

@chorman0773
Copy link
Author

chorman0773 commented Apr 18, 2024

FFI calls cannot perform operations on the Rust AM's state that don't correspond to valid AM operations (for example, it can't write through a Frozen or Disabled tag).
That being said, there is indeed nothing that prevents the FFI function from pulling the value it returns from Thin Air.

@RalfJung
Copy link
Member

RalfJung commented Apr 18, 2024

I assumed this "hack" is sound because the reads happen entirely outside of the Rust abstract machine, i.e. in the assembly I'm abiding by an entirely different abstract machine's rules. Are you suggesting otherwise?

You can't just freely compose two AMs. Insofar as the AMs access or change each other's state, they can only do that in ways that the other AM would already permit anyway. That is required to ensure soundness of whatever optimizations the compiler performs based on one AM (not knowing anything about the other AM).

So as long as Rust doesn't have freeze, doing it via inline assembly is very questionable at best. Freeze needs to inspect AM state in ways that the AM does not permit ("is this byte initialized"), which IMO makes implementing freeze via inline asm unsound. But I admit I can't construct a miscompilation so maybe this is some sort of barely sound gray area. Given the concerns around leaking secrets, I think it should definitely be treated as something Rust does not intend to be possible, until there was a decision to permit it (i.e., until this RFC gets accepted).

@ryanavella
Copy link

(you mentioned inline assembly a few times, and I just wanted to clarify I am talking about externally linked assembly)

From the assembly AM's perspective there is no such thing as uninitialized memory, so lto passes can't just freely miscompile unless if the system linker is somehow aware of the specifics of Rust's AM.

I'd be curious if your position would change if it was primarily assembly calling into Rust?

@RalfJung
Copy link
Member

RalfJung commented Apr 18, 2024 via email

@Diggsey
Copy link
Contributor

Diggsey commented Apr 18, 2024

So as long as Rust doesn't have freeze, doing it via inline assembly is very questionable at best. Freeze needs to inspect AM state in ways that the AM does not permit ("is this byte initialized"), which IMO makes implementing freeze via inline asm unsound.

I don't agree - ultimately both AMs must somehow map their states onto the real hardware state. If an operation is sound in one AM and doesn't modify any state that is visible to the other AM, then it doesn't matter what rules that second AM has, the operation cannot be unsound. It's simply outside the jurisdiction of the other AM.

That's the case here - ASM doesn't have a concept of uninitialized memory, so the read operation is sound as far as ASM is concerned. The read operation also doesn't modify any real hardware state that is visible to Rust. Therefore the operation is sound.

@RalfJung
Copy link
Member

RalfJung commented Apr 18, 2024

If an operation is sound in one AM and doesn't modify any state that is visible to the other AM, then it doesn't matter what rules that second AM has, the operation cannot be unsound.

That's not necessarily true. An AM can have properties of indistinguishably, where two different AM states are known to be equivalent for all observers that run inside the AM. If now another AM performs a transition that can only be explained by telling apart these supposedly indistinguishable states, that could lead to unsoundness if the compiler assumes that telling them apart is impossible.

After all, freeze is crucially not a NOP in the Rust AM. It doesn't modify hardware state (leaving aside MADV_FREE for a second), but it does inspect and modify AM state. It is this transition on AM state that must be argued to be sound. This is obviously the case if the transition is possible by regular means inside the AM. But for freeze, that obvious argument does not work. A very non-obvious argument may be possible, but it is a lot more complicated than "the hardware state doesn't change" -- in particular, it necessarily has to involve reasoning about the fact that the AM is "monotone" wrt initialization.

@Diggsey
Copy link
Contributor

Diggsey commented Apr 18, 2024

that could lead to unsoundness if the compiler assumes that telling them apart is impossible.

Firstly, that's only the case if the ASM returns that information to the part of the program running within the AM. The read alone is still sound regardless of what the AM says.

Secondly, in the case of the Rust AM, it would be non-sensical for a Rust compiler to be allowed to make that assumption, since in practice you can always distinguish hardware states (such as via timing side-channels, OS calls, etc.)

@RalfJung
Copy link
Member

RalfJung commented Apr 19, 2024

Firstly, that's only the case if the ASM returns that information to the part of the program running within the AM. The read alone is still sound regardless of what the AM says.

No. If the compiler considers the states equivalent it will freely change the program from one to the other, making it unsound for any part of the program (in any AM) to observe it.

Secondly, in the case of the Rust AM, it would be non-sensical for a Rust compiler to be allowed to make that assumption, since in practice you can always distinguish hardware states (such as via timing side-channels, OS calls, etc.)

Timing is considered not observable. And you cannot se the hardware state to distinguish AM states that map to the same hardware state.

@programmerjake
Copy link
Member

programmerjake commented Apr 19, 2024

what about the argument that, with an asm read, if it could read uninit then we just know it returned some non-uninit value, we don't know what -- this allows the AM to have whatever undef-or-not shenanigans it likes since those are just different ways of cooking up a unknown value. if it can't read uninit then the asm must return the value actually in memory, so is perfectly defined.
afaict this is sound as long as we don't tell the compiler it can freely duplicate the asm and expect the values to be the same.

@RalfJung
Copy link
Member

RalfJung commented Apr 19, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. T-opsem Relevant to the operational semantics team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet