Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make <&str as Arbitrary>::arbitrary_take_rest less likely to fail #151

Merged
merged 1 commit into from
Jun 13, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
48 changes: 30 additions & 18 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -834,29 +834,33 @@ where
}
}

fn arbitrary_str<'a>(u: &mut Unstructured<'a>, size: usize) -> Result<&'a str> {
match str::from_utf8(u.peek_bytes(size).unwrap()) {
Ok(s) => {
u.bytes(size).unwrap();
Ok(s)
}
Err(e) => {
let i = e.valid_up_to();
let valid = u.bytes(i).unwrap();
let s = unsafe {
debug_assert!(str::from_utf8(valid).is_ok());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: should we be including this? fuzzing is usually run with debug assertions on so the debug_assert doesn't get us much. I am comfortable with more thoroughly documenting the safety situation instead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a couple other debug asserts that also maybe should only be enabled when running the arbitrary crate tests. We can figure out what to do with all of them in a follow up, no need to hold this PR up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This debug_assert comes from the previous implementation of arbitrary; if you think it’d be better to replace it with a comment that valid_up_to can only generate valid strings, or with another macro that’d assert only on eg. an internal-tests arbitrary feature, I’d be happy to make a follow-up PR :)

str::from_utf8_unchecked(valid)
};
Ok(s)
}
}
}

impl<'a> Arbitrary<'a> for &'a str {
fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self> {
let size = u.arbitrary_len::<u8>()?;
match str::from_utf8(u.peek_bytes(size).unwrap()) {
Ok(s) => {
u.bytes(size).unwrap();
Ok(s)
}
Err(e) => {
let i = e.valid_up_to();
let valid = u.bytes(i).unwrap();
let s = unsafe {
debug_assert!(str::from_utf8(valid).is_ok());
str::from_utf8_unchecked(valid)
};
Ok(s)
}
}
arbitrary_str(u, size)
}

fn arbitrary_take_rest(u: Unstructured<'a>) -> Result<Self> {
let bytes = u.take_rest();
str::from_utf8(bytes).map_err(|_| Error::IncorrectFormat)
fn arbitrary_take_rest(mut u: Unstructured<'a>) -> Result<Self> {
let size = u.len();
arbitrary_str(&mut u, size)
}

#[inline]
Expand Down Expand Up @@ -1255,6 +1259,7 @@ mod test {

#[test]
fn arbitrary_take_rest() {
// Basic examples
let x = [1, 2, 3, 4];
assert_eq!(
checked_arbitrary_take_rest::<&[u8]>(Unstructured::new(&x)).unwrap(),
Expand All @@ -1273,6 +1278,7 @@ mod test {
"\x01\x02\x03\x04"
);

// Empty remainder
assert_eq!(
checked_arbitrary_take_rest::<&[u8]>(Unstructured::new(&[])).unwrap(),
&[]
Expand All @@ -1281,6 +1287,12 @@ mod test {
checked_arbitrary_take_rest::<Vec<u8>>(Unstructured::new(&[])).unwrap(),
&[]
);

// Cannot consume all but can consume part of the input
assert_eq!(
checked_arbitrary_take_rest::<String>(Unstructured::new(&[1, 0xFF, 2])).unwrap(),
"\x01"
);
}

#[test]
Expand Down