Precise decode output slice length checking #269

Merged (6 commits) on Mar 2, 2024
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "base64"
version = "0.21.7"
version = "0.22.0"
authors = ["Alice Maz <alice@alicemaz.com>", "Marshall Pierce <marshall@mpierce.org>"]
description = "encodes and decodes base64 as bytes or utf8"
repository = "https://github.com/marshallpierce/rust-base64"
6 changes: 6 additions & 0 deletions RELEASE-NOTES.md
@@ -1,3 +1,9 @@
# 0.22.0

- `DecodeSliceError::OutputSliceTooSmall` is now conservative rather than precise. That is, the error will only occur if the decoded output _cannot_ fit, meaning that `Engine::decode_slice` can now be used with exactly-sized output slices. As part of this, `Engine::internal_decode` now returns `DecodeSliceError` instead of `DecodeError`, but that is not expected to affect any external callers.
- `DecodeError::InvalidLength` now refers specifically to the _number of valid symbols_ being invalid (i.e. `len % 4 == 1`), rather than just the number of input bytes. This avoids confusing scenarios when based on interpretation you could make a case for either `InvalidLength` or `InvalidByte` being appropriate.
- Decoding is somewhat faster (5-10%)

# 0.21.7

- Support getting an alphabet's contents as a str via `Alphabet::as_str()`
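The `InvalidLength` rule described in the 0.22.0 notes above (a count of valid symbols with `len % 4 == 1` cannot be decoded) can be illustrated with a small standalone sketch. The function `decoded_len_from_symbols` is hypothetical and is not part of the `base64` crate; it only assumes that each symbol carries 6 bits, so a lone trailing symbol cannot complete a byte. The `Err` payload mirrors the shape of `InvalidLength(usize)` but is an illustration, not the crate's type.

```rust
/// Hypothetical helper, not part of the `base64` crate.
/// Given the number of valid (non-padding) base64 symbols, return the number
/// of bytes they decode to, or `Err(count)` when the count is impossible.
fn decoded_len_from_symbols(valid_symbols: usize) -> Result<usize, usize> {
    // Every complete quad of 4 symbols (24 bits) yields 3 bytes.
    let full_quads = valid_symbols / 4;
    let tail_bytes = match valid_symbols % 4 {
        0 => 0, // no partial quad
        2 => 1, // 12 bits: 1 byte plus 4 discarded bits
        3 => 2, // 18 bits: 2 bytes plus 2 discarded bits
        // A single leftover symbol holds only 6 bits, which is less than a byte.
        _ => return Err(valid_symbols),
    };
    Ok(full_quads * 3 + tail_bytes)
}

fn main() {
    for n in 0..=8 {
        println!("{} symbols -> {:?}", n, decoded_len_from_symbols(n));
    }
}
```

Counting symbols rather than raw input bytes means padding and any rejected bytes do not influence the length check, which is the distinction the release note draws between `InvalidLength` and `InvalidByte`.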
3 changes: 1 addition & 2 deletions benches/benchmarks.rs
@@ -102,9 +102,8 @@ fn do_encode_bench_slice(b: &mut Bencher, &size: &usize) {
fn do_encode_bench_stream(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size);
fill(&mut v);
let mut buf = Vec::new();
let mut buf = Vec::with_capacity(size * 2);

buf.reserve(size * 2);
b.iter(|| {
buf.clear();
let mut stream_enc = write::EncoderWriter::new(&mut buf, &STANDARD);
74 changes: 60 additions & 14 deletions src/decode.rs
@@ -9,18 +9,20 @@ use std::error;
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum DecodeError {
/// An invalid byte was found in the input. The offset and offending byte are provided.
/// Padding characters (`=`) interspersed in the encoded form will be treated as invalid bytes.
///
/// Padding characters (`=`) interspersed in the encoded form are invalid, as they may only
/// be present as the last 0-2 bytes of input.
///
/// This error may also indicate that extraneous trailing input bytes are present, causing
/// otherwise valid padding to no longer be the last bytes of input.
InvalidByte(usize, u8),
/// The length of the input is invalid.
/// A typical cause of this is stray trailing whitespace or other separator bytes.
/// In the case where excess trailing bytes have produced an invalid length *and* the last byte
/// is also an invalid base64 symbol (as would be the case for whitespace, etc), `InvalidByte`
/// will be emitted instead of `InvalidLength` to make the issue easier to debug.
InvalidLength,
/// The length of the input, as measured in valid base64 symbols, is invalid.
/// There must be 2-4 symbols in the last input quad.
InvalidLength(usize),
/// The last non-padding input symbol's encoded 6 bits have nonzero bits that will be discarded.
/// This is indicative of corrupted or truncated Base64.
/// Unlike `InvalidByte`, which reports symbols that aren't in the alphabet, this error is for
/// symbols that are in the alphabet but represent nonsensical encodings.
/// Unlike [DecodeError::InvalidByte], which reports symbols that aren't in the alphabet,
/// this error is for symbols that are in the alphabet but represent nonsensical encodings.
InvalidLastSymbol(usize, u8),
/// The nature of the padding was not as configured: absent or incorrect when it must be
/// canonical, or present when it must be absent, etc.
@@ -30,8 +32,10 @@ pub enum DecodeError {
impl fmt::Display for DecodeError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Self::InvalidByte(index, byte) => write!(f, "Invalid byte {}, offset {}.", byte, index),
Self::InvalidLength => write!(f, "Encoded text cannot have a 6-bit remainder."),
Self::InvalidByte(index, byte) => {
write!(f, "Invalid symbol {}, offset {}.", byte, index)
}
Self::InvalidLength(len) => write!(f, "Invalid input length: {}", len),
Self::InvalidLastSymbol(index, byte) => {
write!(f, "Invalid last symbol {}, offset {}.", byte, index)
}
@@ -48,9 +52,7 @@ impl error::Error for DecodeError {}
pub enum DecodeSliceError {
/// A [DecodeError] occurred
DecodeError(DecodeError),
/// The provided slice _may_ be too small.
///
/// The check is conservative (assumes the last triplet of output bytes will all be needed).
/// The provided slice is too small.
OutputSliceTooSmall,
}
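To show why the doc comment on `OutputSliceTooSmall` changed from "may be too small" to "is too small", here is a rough standalone sketch comparing an up-front size estimate with the exact decoded size. Both functions are hypothetical and do not reproduce the crate's internals. The estimate formula (whole quads rounded up, 3 bytes each) is an assumption that merely agrees with the `decoded_len_estimate(4) == 3` test in this diff.

```rust
/// Hypothetical estimate, assumed for illustration: round the encoded length
/// up to whole quads and allow 3 output bytes per quad.
fn conservative_estimate(encoded_len: usize) -> usize {
    (encoded_len + 3) / 4 * 3
}

/// Hypothetical exact size: count the symbols before any trailing `=` padding
/// and convert that count to bytes. Returns `None` for an impossible count.
/// This does not validate that the symbols belong to a base64 alphabet.
fn exact_len(encoded: &[u8]) -> Option<usize> {
    let symbols = encoded.iter().rev().skip_while(|&&b| b == b'=').count();
    let tail = match symbols % 4 {
        0 => 0,
        2 => 1,
        3 => 2,
        _ => return None, // one leftover symbol cannot form a byte
    };
    Some(symbols / 4 * 3 + tail)
}

fn main() {
    // "Zm8=" is the standard padded encoding of the two bytes "fo".
    let input = b"Zm8=";
    let out = [0u8; 2]; // exactly-sized output buffer
    // A check against the estimate would reject this buffer...
    println!("estimate fits: {}", conservative_estimate(input.len()) <= out.len());
    // ...while a check against the exact length accepts it.
    println!("exact fits: {}", exact_len(input).map_or(false, |n| n <= out.len()));
}
```

Under these assumptions, a 2-byte buffer for `Zm8=` fails the estimate-based check but passes the exact one, which matches the release note's statement that `Engine::decode_slice` can now be used with exactly-sized output slices.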

@@ -338,3 +340,47 @@ mod tests {
}
}
}

#[allow(deprecated)]
#[cfg(test)]
mod coverage_gaming {
use super::*;
use std::error::Error;

#[test]
fn decode_error() {
let _ = format!("{:?}", DecodeError::InvalidPadding.clone());
let _ = format!(
"{} {} {} {}",
DecodeError::InvalidByte(0, 0),
DecodeError::InvalidLength(0),
DecodeError::InvalidLastSymbol(0, 0),
DecodeError::InvalidPadding,
);
}

#[test]
fn decode_slice_error() {
let _ = format!("{:?}", DecodeSliceError::OutputSliceTooSmall.clone());
let _ = format!(
"{} {}",
DecodeSliceError::OutputSliceTooSmall,
DecodeSliceError::DecodeError(DecodeError::InvalidPadding)
);
let _ = DecodeSliceError::OutputSliceTooSmall.source();
let _ = DecodeSliceError::DecodeError(DecodeError::InvalidPadding).source();
}

#[test]
fn deprecated_fns() {
let _ = decode("");
let _ = decode_engine("", &crate::prelude::BASE64_STANDARD);
let _ = decode_engine_vec("", &mut Vec::new(), &crate::prelude::BASE64_STANDARD);
let _ = decode_engine_slice("", &mut [], &crate::prelude::BASE64_STANDARD);
}

#[test]
fn decoded_len_est() {
assert_eq!(3, decoded_len_estimate(4));
}
}