
Added possibility to pass uninit arrays to random generator #271

Closed · wants to merge 1 commit

Conversation

AngelicosPhosphoros

This can be useful to skip extra work on the caller side, and it is more logical because we don't really need initialized values just to overwrite them with new ones.
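
A minimal caller-side sketch of what this would enable (the function name getrandom_uninit here is hypothetical, standing in for whatever entry point this PR exposes):

use core::mem::MaybeUninit;
use getrandom::Error;

fn random_key() -> Result<[u8; 32], Error> {
    // No need to zero the array first; the RNG overwrites every byte.
    let mut buf = [MaybeUninit::<u8>::uninit(); 32];
    // `getrandom_uninit` is the hypothetical uninit-filling entry point.
    getrandom_uninit(&mut buf)?;
    // SAFETY: on `Ok(())` every byte of `buf` has been initialized with random data.
    Ok(unsafe { core::mem::transmute::<[MaybeUninit<u8>; 32], [u8; 32]>(buf) })
}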

@AngelicosPhosphoros (Author)

I don't really know how platforms other than Linux and Windows operate, so there may be things here that I implemented improperly. For example, must the memory we hand to JS to be filled with random bytes already be initialized?


const BCRYPT_USE_SYSTEM_PREFERRED_RNG: u32 = 0x00000002;

#[link(name = "bcrypt")]
extern "system" {
    fn BCryptGenRandom(
        hAlgorithm: *mut c_void,
-       pBuffer: *mut u8,
+       pBuffer: *mut MaybeUninit<u8>,
@AngelicosPhosphoros (Author)

T and MaybeUninit<T> are equivalent for FFI purposes: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#layout
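
A tiny compile-time check (not part of this PR) illustrating the layout guarantee that makes this substitution sound:

use core::mem::{align_of, size_of, MaybeUninit};

// `MaybeUninit<T>` is documented to have the same size, alignment, and ABI as `T`,
// which is why swapping `*mut u8` for `*mut MaybeUninit<u8>` in the extern
// declaration does not change the foreign function's view of the argument.
const _: () = assert!(size_of::<MaybeUninit<u8>>() == size_of::<u8>());
const _: () = assert!(align_of::<MaybeUninit<u8>>() == align_of::<u8>());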

src/lib.rs Outdated
Comment on lines 294 to 299
// Help Miri catch mistakes: if the user calls `assume_init()`
// on these elements, Miri will report an error.
#[cfg(miri)]
if res.is_err() {
    dest.fill(MaybeUninit::uninit());
}
@AngelicosPhosphoros (Author)

I am thinking that maybe this would be better placed in the rand crate, letting rand use getrandom_to_uninit even for initialized slices.

@AngelicosPhosphoros (Author)

Also related PR: rust-random/rand#1241

@notgull commented Jul 8, 2022

I like this idea. Not only because of the potential utility, but because I think having methods like this to deal with uninit slices should be a pattern throughout the ecosystem.

My only concern is whether using ReadBuf would be a better fit.

@josephlr (Member) commented Jul 8, 2022

Personally, I would want to wait until there's a standard interface in the standard library (ReadBuf), instead of implementing our own non-standard interface to deal with uninitialized buffers.

@josephlr (Member) commented Jul 8, 2022

Also, as this is explicitly a performance improvement, could you add some benchmarks so that we can see the perf impact of this change (vs. just zeroing the uninit buffer and calling getrandom normally)?

I would guess the best case would be the rdrand implementation, as that's the fastest RNG we have in here, IIRC.

@AngelicosPhosphoros (Author)

> Personally, I would want to wait until there's a standard interface in the standard library (ReadBuf), instead of implementing our own non-standard interface to deal with uninitialized buffers.

It looks good, but it is defined in the std library rather than in core, and getrandom can be used in environments without std.

@AngelicosPhosphoros (Author)

@josephlr @notgull What do you think about this updated API?

@notgull commented Jul 8, 2022

I like this!

Is there any particular reason why ReadBuf isn't available in core?

@@ -7,10 +7,12 @@
 // except according to those terms.

 //! Implementation for Nintendo 3DS
+use core::mem::MaybeUninit;
@josephlr (Member) Jul 9, 2022

Note: This increases our MSRV to 1.36 (current is 1.34), so it is a breaking change. However, if we gate everything behind the read_buf feature, this shouldn't be an issue, as that will require nightly anyway.

@josephlr (Member) commented Jul 9, 2022

> > Personally, I would want to wait until there's a standard interface in the standard library (ReadBuf), instead of implementing our own non-standard interface to deal with uninitialized buffers.
>
> It looks good, but it is defined in the std library rather than in core, and getrandom can be used in environments without std.

Then I think for now we should add an off-by-default "read_buf" feature that:

  • Depends on the "std" feature
  • Enables the read_buf nightly feature
  • Documents that the feature requires libstd and that it requires a recent-ish nightly

Then we could have code like this in our Cargo.toml:

# Nightly-only implementation with `std::io::ReadBuf` struct.
read_buf = ["std"]

and in lib.rs

#![cfg_attr(feature = "read_buf", feature(read_buf))]

#[cfg(feature = "std")]
extern crate std;
#[cfg(feature = "read_buf")]
use std::io::ReadBuf;

Then everywhere else we can just refer to ReadBuf or crate::ReadBuf. That way if ReadBuf moves to core, we only have to change 1-2 lines.
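
A sketch of what such a gated entry point might look like, assuming the nightly ReadBuf API as it existed at the time (unfilled_mut, assume_init, add_filled) plus a hypothetical getrandom_uninit that fills a &mut [MaybeUninit<u8>]:

#[cfg(feature = "read_buf")]
pub fn getrandom_read_buf(buf: &mut ReadBuf<'_>) -> Result<(), Error> {
    // SAFETY: we only ever write initialized bytes into the unfilled region.
    let unfilled = unsafe { buf.unfilled_mut() };
    let n = unfilled.len();
    // `getrandom_uninit` is the hypothetical `&mut [MaybeUninit<u8>]` filler.
    getrandom_uninit(unfilled)?;
    // SAFETY: on success all `n` bytes of the unfilled region were initialized.
    unsafe { buf.assume_init(n) };
    buf.add_filled(n);
    Ok(())
}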

@josephlr (Member) commented Jul 9, 2022

@AngelicosPhosphoros before we get too far into this, could you add some benchmarks to show that this is actually faster when dealing with uninitialized buffers? I don't want you to waste your time if it turns out there isn't a notable difference.

Specifically, I would want to compare:

  • Directly reading random data into an uninitialized buffer
  • Initializing a buffer (say with zero), then calling the normal getrandom method

I would recommend focusing on the rdrand implementation or the linux implementation for benchmarking, as those are the two fastest IIRC, so it's where the cost of writing the buffer twice would be the most noticeable. Feel free to add the benchmarks to this PR using #[bench].
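
A minimal #[bench] sketch of the zero-then-fill side of that comparison (the uninit side would need the API from this PR; the 1024-byte buffer size is arbitrary):

#![feature(test)]
extern crate test;

use test::{black_box, Bencher};

#[bench]
fn bench_zeroed_then_getrandom(b: &mut Bencher) {
    b.bytes = 1024;
    b.iter(|| {
        // The memset the new API would avoid...
        let mut buf = [0u8; 1024];
        // ...followed by the actual RNG call.
        getrandom::getrandom(&mut buf).unwrap();
        black_box(&buf);
    });
}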

If it turns out there's a statistically significant improvement, I think there is a path to incorporate such functionality into getrandom. However, it might be the case that this optimization doesn't matter for getrandom (as it calls the underlying OS RNG, which can be slow compared to memset/bzero). In that case, this optimization might only matter for dealing with a userspace CSPRNG, so such uninitialized buffer code would be best to live in the rand (and similar) crates, rather than this crate.

@josephlr (Member) commented Jul 9, 2022

See https://godbolt.org/z/8WKn6oGrT for some example implementations using the existing API and the proposed new API.

The generated code confirms that regardless of how we use the existing API, it basically becomes a call to memset+getrandom_inner, while using the new API lets us only have a call to getrandom_inner.

Now it's just a question of whether getrandom_inner is fast enough for the initial call to memset to matter.
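
Roughly, the two shapes being compared look like this (a reconstruction, not the code from the godbolt link; getrandom_uninit stands in for the proposed API):

use core::mem::MaybeUninit;

// Existing API: the caller must initialize the buffer first, so the compiler
// emits a memset before the call into the inner implementation.
pub fn fill_via_zeroing(out: &mut [MaybeUninit<u8>]) -> Result<(), Error> {
    for byte in out.iter_mut() {
        byte.write(0); // this loop compiles down to the memset
    }
    // SAFETY: every byte was just initialized to zero above.
    let init: &mut [u8] =
        unsafe { core::slice::from_raw_parts_mut(out.as_mut_ptr() as *mut u8, out.len()) };
    getrandom(init)
}

// Proposed API: the memset disappears; only the RNG call remains.
pub fn fill_directly(out: &mut [MaybeUninit<u8>]) -> Result<(), Error> {
    getrandom_uninit(out)
}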

@josephlr (Member)
@AngelicosPhosphoros actually I just realized that we don't need this PR to benchmark the cost of initialization + calling getrandom. #272 adds benchmarks that should be able to measure the cost.

@newpavlov (Member)
On Linux and "common" hardware memset is approximately 2 orders of magnitude faster than calling getrandom (~10 GB/s vs ~80 MB/s), so I highly doubt we will be able to see the performance difference on common platforms. The situation may change on constrained targets (e.g. embedded devices).

Either way, I don't think it's worth using ReadBuf and MaybeUninit in our case. A better solution could be something like this:

// It's safe to assume that the buffer gets fully filled if the function returns `Ok(())`.
pub unsafe fn getrandom_raw(buf_ptr: *mut u8, buf_len: usize) -> Result<(), Error> { /* ... */ }

#[inline(always)]
pub fn getrandom(buf: &mut [u8]) -> Result<(), Error> {
    unsafe { getrandom_raw(buf.as_mut_ptr(), buf.len()) }
}
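
For the uninitialized-buffer use case, a caller could then build on that raw function without the crate ever naming MaybeUninit (a sketch; random_array is hypothetical and getrandom_raw is the proposed function above):

use core::mem::MaybeUninit;
use getrandom::Error;

// Hypothetical helper built on top of the proposed `getrandom_raw`.
pub fn random_array<const N: usize>() -> Result<[u8; N], Error> {
    let mut buf = MaybeUninit::<[u8; N]>::uninit();
    // SAFETY: the pointer is valid for `N` writes, and `getrandom_raw` promises
    // the buffer is fully filled whenever it returns `Ok(())`.
    unsafe {
        getrandom_raw(buf.as_mut_ptr() as *mut u8, N)?;
        Ok(buf.assume_init())
    }
}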

@josephlr (Member)
I think @newpavlov's suggestion in #271 (comment) is the way we will want to go with this. Closing in favor of #279

@josephlr closed this Aug 29, 2022