Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add getrandom::array() #293

Closed
wants to merge 1 commit into from

Conversation

briansmith
Copy link
Contributor

@briansmith briansmith commented Oct 19, 2022

Implement getrandom_array. It requires Rust 1.51 and the user must enable the "array" feature explicitly.

@josephlr
Copy link
Member

josephlr commented Oct 20, 2022

3 early comments on this PR:

  1. We should not add a rust-1-55 feature to support later versions of Rust.
    Instead, we should use a crate like rustversion and mark the method like:

    #[inline]
    #[rustversion::since(1.51)]
    pub fn getrandom_array<T: Array>() -> Result<T, Error> {
        // ...
    }

    Then we can add a #[doc(cfg(...))] indicating that it is only available on Rust versions 1.51 and later.

  2. I think the implementation of getrandom_array can be cleaner.
    We could just add a polyfill for MaybeUninit::as_bytes_mut (like we do for other methods):

    #[inline(always)]
    fn uninit_as_bytes_mut<T>(t: &mut MaybeUninit<T>) -> &mut [MaybeUninit<u8>] {
        unsafe {
            slice::from_raw_parts_mut(
                t.as_mut_ptr() as *mut MaybeUninit<u8>,
                mem::size_of::<T>(),
            )
        }
    }

    Our implementation would then be quite simple:

    #[inline]
    pub fn getrandom_array<T: Array>() -> Result<T, Error> {
        let mut uninit = MaybeUninit::<T>::uninit();
        getrandom_uninit(uninit_as_bytes_mut(&mut uninit))?;
        Ok(unsafe { uninit.assume_init() })
    }

    We also don't need inline(always) (see below).

  3. Is support for returning nested byte arrays really necessary?
    If getrandom_array could only return [u8; N], we could still do something like:

    pub fn get_ints() -> [u64; 8] {
        let arr: [u8; 64] = getrandom_array().unwrap();
        let mut ints = [0u64; 8];
        for (i, chunk) in arr.chunks_exact(8).enumerate() {
            let arr_chunk: &[u8; 8] = chunk.try_into().unwrap();
            ints[i] = u64::from_ne_bytes(*arr_chunk);
        }
        ints
    }

    or if we use feature(array_chunks):

    pub fn get_ints() -> [u64; 8] {
        let arr: [u8; 64] = getrandom_array().unwrap();
        let mut ints = [0u64; 8];
        for (i, chunk) in arr.array_chunks::<8>().enumerate() {
            ints[i] = u64::from_ne_bytes(*chunk);
        }
        ints
    }

    Of course, if we did allow returning nested arrays, the code can be shorter:

    pub fn get_ints() -> [u64; 8] {
        let arr: [[u8; 8]; 8] = getrandom_array().unwrap();
        arr.map(u64::from_ne_bytes)
    }

    I'm not sure that the niche use case of nested arrays is worth expanding the API for this crate.

I put my above proposed implementation in this Godbolt Example. All three implementations of get_ints() compile to roughly the same code, and inline(always) isn't needed.

@briansmith
Copy link
Contributor Author

  1. We should not add a rust-1-55 feature to support later versions of Rust.
    Instead, we should use a crate like rustversion and mark the method like:

rustversion is very nicely implemented but the fundamental problem with rustversion is that it uses a build script, which is a very heavyweight mechanism compared to adding a feature flag to this crate.

  1. I think the implementation of getrandom_array can be cleaner. We could just add a polyfill for MaybeUninit::as_bytes_mut (like we do for other methods):

Maybe. At a minimum, such a function needs to be marked as unsafe fn since it isn't actually safe to cast from any &mut MaybeUninit<T> to &mut [MaybeUninit<u8>] since not every type T is safely transmutable from arbitrary bytes. I tried to write a specialized version for (arrays of) arrays of bytes but ran into const generics limitations. Ultimately inlining the implementation into getrandom_array was (is?) the minimally-unsafe (minimally powerful) thing to do presently.

I have some other ideas for a simpler implementation using transmute but I haven't tried it yet.

  1. Is support for returning nested byte arrays really necessary?

The goal isn't really to support nested byte arrays, but to make it easy to construct arrays of other types in a way that doesn't require us to manually add support to new types on a regular basis.

In the case of ECC we need random [Limb; N] arrays during key and nonce generation. Depending on the implementation of RSA keygen and/or some other blinding operations we need similar. Depending on the implementation hash functions, HMACs, and similar need similar.

If getrandom_array could only return [u8; N], we could still do something like:

pub fn get_ints() -> [u64; 8] {

In most cases I imagine one will want to write a random_limbs<T, const N>() -> Result<[Limb; N], Error> function or similar. Last time I tried I ran into the need to use generic_const_exprs which is not only unstable, but which the compiler warns "the feature generic_const_exprs is incomplete and may not be safe to use and/or cause compiler crashes."

array_chunks is a nonstarter for the kinds of things I'm trying to support since they want to work on stable Rust. It doesn't look like array_chunks is going to be stable any time soon.

So, the solution I proposed is ugly but AFAIK it's the prettiest and most minimal solution that doesn't force the user to write ugly, tedious code using chunks_exact (especially people who review the code I'm writing are keen to statically avoid unwrap()) or to use unstable features and/or unsafe features.

Of course, if we did allow returning nested arrays, the code can be shorter:

pub fn get_ints() -> [u64; 8] {
    let arr: [[u8; 8]; 8] = getrandom_array().unwrap();
    arr.map(u64::from_ne_bytes)
}

It would often become one line that avoids the need to define a function: let random_limbs: [Limb; 8] = getrandom_array()?.map(Limb::from_ne_bytes).

I put my above proposed implementation in this Godbolt Example. All three implementations of get_ints() compile to roughly the same code, and inline(always) isn't needed.

It may be the case that #[inline(always)] isn't needed. However, to verify that, we'd better define getrandom_array in a separate crate from get_ints() and investigate the (lack of) cross-crate inlining at various optimization levels.

@newpavlov
Copy link
Member

newpavlov commented Oct 20, 2022

  1. We should not add a rust-1-55 feature to support later versions of Rust.

rustversion not only uses build scripts, but also introduces an additional dependency. Some users are quite sensitive to size of their dependency tree. Also, personally, I find compiler version dependent cfgs a bit too surprising for my taste.

I think it's better to use a different, more descriptive feature name. getrandom_array is the most obvious opition.

  1. I think the implementation of getrandom_array can be cleaner.

I think the current implementation is clean enough, though it could be worth to find a better name for TransmuteFromArbitraryBytes.

But I do not understand why the following approach was not used for the trait (implementation of the function stays the same):

// The trait name is just a brief placeholder

pub unsafe trait Pod {}

unsafe impl Pod for u8 {}
unsafe impl Pod for u16 {}
unsafe impl Pod for u32 {}
unsafe impl Pod for u64 {}
unsafe impl Pod for u128 {}
unsafe impl Pod for usize {}

// It's safe to write into padding bytes, so this impl correct for all types,
// including with alignment bigger than size.
unsafe impl<T: Pod, const N: usize> Pod for [T; N] {}

It should automatically provide support for [Limb; N] and arbitrary nested arrays. Though I think the latter will be used extremely rarely in practice. If users will request it, later we can add other Pod impls for core types (e.g. signed integers, Wrapping<T>, SIMD types, etc.)

All three implementations of get_ints() compile to roughly the same code, and inline(always) isn't needed.

I think it's better to use #[inline(always)] just to be safe, otherwise Rust may compile code into binary with redundant stack uses.

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@josephlr
Copy link
Member

If users will request it, later we can add other Pod impls for core types (e.g. signed integers, Wrapping<T>, SIMD types, etc.)

This is looking very similar to zerocopy's FromBytes trait. If we ended up doing something like this, it might make more sense to either:

  • Have zerocopy as an optional dependency of getrandom which then enables this function.
  • Have getrandom as an optional dependency of zerocopy, and we add FromBytes::new_random to the zerocopy crate.

I would prefer the second approach to the first, but both approaches also solve the issue of needing something like rustversion, as users would just use reasonably named features to opt-in to the functionality.

If we just want something standalone for this crate, I think we should only deal with arrays of u8 (maybe supporting nested arrays). Dealing with other non-u8 types seems unnecessary when those types already have a way to convert to/from arrays of u8 (as @briansmith showed above).

@josephlr
Copy link
Member

I think it's better to use a different, more descriptive feature name. getrandom_array is the most obvious opition.

I would be fine with an opt-in feature like this. We would then just document that the MSRV for this feature is higher than the crate's default MSRV.

@josephlr
Copy link
Member

I think it's better to use #[inline(always)] just to be safe, otherwise Rust may compile code into binary with redundant stack uses.

I think the general advice for Rust is to prefer #[inline] to #[inline(always)]. For example, if someone is compiling with opt-level=s and calling getrandom_array with the same type in multiple places, they may not want to inline the function. However, once we agree on an implementation, we should actually compile some example crates (not just on Godbolt), and see if there are actually any issues with #[inline].

@josephlr
Copy link
Member

josephlr commented Oct 20, 2022

It would often become one line that avoids the need to define a function: let random_limbs: [Limb; 8] = getrandom_array()?.map(Limb::from_ne_bytes).

It would be awesome if we could get something like this to work. When I did:

let x: [u64; 8] = getrandom_array()?.map(u64::from_ne_bytes);

I kept getting a type annotations needed error.

EDIT: doing something like:

pub unsafe trait Bytes {}
unsafe impl Bytes for u8 {}
unsafe impl<B: Bytes, const N: usize> Bytes for [B; N] {}

pub fn getrandom_array<B: Bytes, const N: usize>() -> Result<[B; N], Error> { ... }

allows the above code to compile without any type annotations.

@briansmith briansmith changed the title Add getrandom_arrays and the rust-1-55 feature. Add getrandom_arrays Oct 21, 2022
@briansmith briansmith changed the title Add getrandom_arrays Add getrandom_array Oct 21, 2022
@briansmith briansmith force-pushed the b/uninit-array branch 3 times, most recently from 07a9d4f to ad0af77 Compare October 21, 2022 05:49
@briansmith
Copy link
Contributor Author

Regarding the naming of getrandom_array (and getrandom_uninit): Rust naming conventions are to avoid prefixing item names with the module/crate name. So perhaps these functions should be array() (and fill_uninit?).

I clean up the draft a bit to take into account the suggestions above. I do not think it is a good idea to try to maintain a list of types to support; I'd rather not maintain a subset of zerocopy/bytemuck. Maybe we can expand the types supported when the Safe Transmute Project is further along.

@briansmith briansmith force-pushed the b/uninit-array branch 2 times, most recently from 56d3282 to 11181db Compare October 21, 2022 05:55
@newpavlov
Copy link
Member

newpavlov commented Oct 21, 2022

Regarding the naming of getrandom_array (and getrandom_uninit): Rust naming conventions are to avoid prefixing item names with the module/crate name. So perhaps these functions should be array() (and fill_uninit?).

Personally, I view them as variants of the getrandom function. But agree that getrandom::array() will look better. But would you suggest to potentially rename getrandom to fill? We could potentially have both until the next breaking release and mark the former as #[deprecated].

@notgull
Copy link

notgull commented Oct 21, 2022

Personally, I think this is out of scope for this crate. I think that it would be more idiomatic for this method to just be implemented as a method for OsRng in the rand crate. Especially given the API sprawl involved in the current implementation (the ArrayItem trait), I think we should limit this crate's method to pure number generation and leave higher-level operations to higher-level crates.

@newpavlov
Copy link
Member

@notgull
The main reason to support array generation is to make the main use case of getrandom easier and safer, namely: seed generation. Right now rand_core and similar crates have to use unsafe code, while with the new function they will be able to simply call u64::from_ne_bytes(getrandom::array()).

Implement `getrandom::array`. It requires Rust 1.51 and the user must
enable the "array" feature explicitly.
@briansmith briansmith changed the title Add getrandom_array Add getrandom::array() Oct 29, 2022
This was referenced Mar 25, 2023
@Lokathor
Copy link

Lokathor commented Sep 1, 2023

I know this is an old PR, perhaps bit-rotted

At a minimum, such a function needs to be marked as unsafe fn since it isn't actually safe to cast from any &mut MaybeUninit to &mut [MaybeUninit]

Actually this is allowed specifically because of the MaybeUninit. Any of the bytes can be uninit, which is why getting the T out of the MaybeUninit wrapper is itself also unsafe.

@newpavlov
Copy link
Member

Closing in favor of #381.

@newpavlov newpavlov closed this Nov 8, 2023
@briansmith briansmith deleted the b/uninit-array branch May 21, 2024 05:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants