How can `serde` impls take better advantage of length hints? #2494

jamesmunns · 2023-07-08T18:50:52Z

Hi there, author of postcard here, I had a user report an issue where deserializing a Vec<u8> significantly overallocated. They were deserializing a Vec<u8> of length 9000, and were panicking because they failed to allocate 16KiB (this is on an embedded target with std but limited memory).

This is sort of explainable with Vec's growth strategy, which is currently power-of-two-ish, but seemed disappointing as postcard as a format will always know the length of byte arrays ahead of time.

We also tried switching to Box<[u8]>, but it looked like Vec<u8> was used as an intermediary, so it did not help.

I was able to reproduce this on desktop, and I was able to provide a workaround, roughly:

#[derive(Serialize, Deserialize)]
struct Data3 {
    #[serde(deserialize_with = "deserialize_vec")]
    v: Vec<u8>,
}

fn deserialize_vec<'de, D>(deserializer: D) -> Result<Vec<u8>, D::Error>
where
    D: Deserializer<'de>,
{
    struct MaxVisitor(PhantomData<u8>);

    impl<'de> Visitor<'de> for MaxVisitor
    {
        /// Return type of this visitor. This visitor computes the max of a
        /// sequence of values of type T, so the type of the maximum is T.
        type Value = Vec<u8>;

        fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
            formatter.write_str("a nonempty sequence of numbers")
        }

        fn visit_bytes<E>(self, v: &[u8]) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            let mut buf = Vec::with_capacity(v.len());
            buf.extend_from_slice(v);
            Ok(buf)
        }
    }

    // Create the visitor and ask the deserializer to drive it. The
    // deserializer will call visitor.visit_seq() if a seq is present in
    // the input data.
    let visitor = MaxVisitor(PhantomData);
    deserializer.deserialize_bytes(visitor)
}

Which I observed allocated exactly the right amount only once.

Is there any way I can make this always the case from postcard's perspective, at least for [u8] (and potentially str)?

Full reproduction code and output below:

Code (it's gross, sorry)

use core::fmt;
use std::{
    alloc::{GlobalAlloc, System},
    cell::UnsafeCell,
    cmp,
    fmt::Write,
    marker::PhantomData,
    sync::{
        atomic::{AtomicBool, Ordering},
        Mutex,
    },
    time::Duration,
};

use serde::{
    de::{self, SeqAccess, Visitor},
    Deserialize, Deserializer, Serialize,
};

struct LogAlloc;

#[global_allocator]
static LOGA: LogAlloc = LogAlloc;

struct LogBuf {
    buf: [u8; 16 * 1024],
    idx: usize,
}
struct LBMutex {
    lb: UnsafeCell<LogBuf>,
    busy: AtomicBool,
}
unsafe impl Sync for LBMutex {}

struct LBMGuard<'a> {
    lb: &'a mut LogBuf,
    b: &'a AtomicBool,
}

impl<'a> Drop for LBMGuard<'a> {
    fn drop(&mut self) {
        self.b.store(false, Ordering::Release);
    }
}

impl LBMutex {
    fn lock<'a>(&'a self) -> Option<LBMGuard<'a>> {
        if !self.busy.swap(true, Ordering::AcqRel) {
            unsafe {
                Some(LBMGuard {
                    lb: &mut *self.lb.get(),
                    b: &self.busy,
                })
            }
        } else {
            None
        }
    }
}

static SYSA: System = System;

static LOGBUF: LBMutex = LBMutex {
    lb: UnsafeCell::new(LogBuf {
        buf: [0u8; 16 * 1024],
        idx: 0,
    }),
    busy: AtomicBool::new(false),
};

#[derive(Serialize, Deserialize)]
struct Data1 {
    v: Vec<u8>,
}

#[derive(Serialize, Deserialize)]
struct Data2 {
    v: Box<[u8]>,
}



impl Write for LogBuf {
    fn write_str(&mut self, s: &str) -> std::fmt::Result {
        self.buf
            .get_mut(self.idx..)
            .unwrap()
            .get_mut(..s.len())
            .unwrap()
            .copy_from_slice(s.as_bytes());
        self.idx += s.len();
        Ok(())
    }
}

#[derive(Serialize, Deserialize)]
struct Data3 {
    #[serde(deserialize_with = "deserialize_vec")]
    v: Vec<u8>,
}

fn deserialize_vec<'de, D>(deserializer: D) -> Result<Vec<u8>, D::Error>
where
    D: Deserializer<'de>,
{
    struct MaxVisitor(PhantomData<u8>);

    impl<'de> Visitor<'de> for MaxVisitor
    {
        /// Return type of this visitor. This visitor computes the max of a
        /// sequence of values of type T, so the type of the maximum is T.
        type Value = Vec<u8>;

        fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
            formatter.write_str("a nonempty sequence of numbers")
        }

        fn visit_bytes<E>(self, v: &[u8]) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            let mut buf = Vec::with_capacity(v.len());
            buf.extend_from_slice(v);
            Ok(buf)
        }
    }

    // Create the visitor and ask the deserializer to drive it. The
    // deserializer will call visitor.visit_seq() if a seq is present in
    // the input data.
    let visitor = MaxVisitor(PhantomData);
    deserializer.deserialize_bytes(visitor)
}

unsafe impl GlobalAlloc for LogAlloc {
    unsafe fn alloc(&self, layout: std::alloc::Layout) -> *mut u8 {
        let ptr = <std::alloc::System as GlobalAlloc>::alloc(&SYSA, layout);
        if let Some(lbmg) = LOGBUF.lock() {
            writeln!(lbmg.lb, "ALLOC  (0x{:016X}): {:?}", ptr as usize, layout).unwrap();
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: std::alloc::Layout) {
        if let Some(lbmg) = LOGBUF.lock() {
            writeln!(lbmg.lb, "DEALLOC(0x{:016X}): {:?}", ptr as usize, layout).unwrap();
        }
        <std::alloc::System as GlobalAlloc>::dealloc(&SYSA, ptr, layout);
    }
}

fn main() {
    println!("Hello, world!");
    let mut buffy: Vec<u8> = Vec::with_capacity(100);
    buffy.push(0);
    println!("{:?}", buffy);
    drop(buffy);

    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Making source...").unwrap()
    }
    let source = Data1 {
        v: vec![0xA5; 9000],
    };
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Made, serializing").unwrap()
    }
    let serd = postcard::to_stdvec(&source).unwrap();
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Serialized, deser as vec").unwrap()
    }
    let deser1: Data1 = postcard::from_bytes(&serd).unwrap();
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(
            lbmg.lb,
            "Deser, len: {}, cap: {}",
            deser1.v.len(),
            deser1.v.capacity()
        )
        .unwrap()
    }

    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Making source2...").unwrap()
    }
    let source = Data2 {
        v: vec![0xA5; 9000].into_boxed_slice(),
    };
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Made2, serializing2").unwrap()
    }
    let serd = postcard::to_stdvec(&source).unwrap();
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Serialized2, deser2 as boxslice").unwrap()
    }
    let deser2: Data2 = postcard::from_bytes(&serd).unwrap();
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Deser2, len: {}", deser2.v.len()).unwrap()
    }


    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Making source3...").unwrap()
    }
    let source = Data3 {
        v: vec![0xA5; 9000],
    };
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Made3, serializing3").unwrap()
    }
    let serd = postcard::to_stdvec(&source).unwrap();
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Serialized3, deser3 as boxslice").unwrap()
    }
    let deser3: Data3 = postcard::from_bytes(&serd).unwrap();
    if let Some(lbmg) = LOGBUF.lock() {
        writeln!(lbmg.lb, "Deser3, len: {}, cap: {}", deser3.v.len(), deser3.v.capacity()).unwrap()
    }

    let lbmg = LOGBUF.lock().unwrap();
    println!(
        "{}",
        core::str::from_utf8(&lbmg.lb.buf[..lbmg.lb.idx]).unwrap()
    );
}

And recorded output:

ALLOC  (0x0000600002D44040): Layout { size: 5, align: 1 (1 << 0) }
ALLOC  (0x0000600003A401C0): Layout { size: 56, align: 8 (1 << 3) }
ALLOC  (0x0000600003A40200): Layout { size: 64, align: 8 (1 << 3) }
ALLOC  (0x0000000126808800): Layout { size: 1024, align: 1 (1 << 0) }
ALLOC  (0x0000600003A40240): Layout { size: 64, align: 8 (1 << 3) }
ALLOC  (0x0000600000440070): Layout { size: 100, align: 1 (1 << 0) }
DEALLOC(0x0000600000440070): Layout { size: 100, align: 1 (1 << 0) }
Making source...
ALLOC  (0x0000000126808C00): Layout { size: 9000, align: 1 (1 << 0) }
Made, serializing
ALLOC  (0x0000600002D48000): Layout { size: 8, align: 1 (1 << 0) }
ALLOC  (0x0000600002D48010): Layout { size: 16, align: 1 (1 << 0) }
DEALLOC(0x0000600002D48000): Layout { size: 8, align: 1 (1 << 0) }
ALLOC  (0x0000600002F44000): Layout { size: 32, align: 1 (1 << 0) }
DEALLOC(0x0000600002D48010): Layout { size: 16, align: 1 (1 << 0) }
ALLOC  (0x0000600003A44000): Layout { size: 64, align: 1 (1 << 0) }
DEALLOC(0x0000600002F44000): Layout { size: 32, align: 1 (1 << 0) }
ALLOC  (0x0000600000144000): Layout { size: 128, align: 1 (1 << 0) }
DEALLOC(0x0000600003A44000): Layout { size: 64, align: 1 (1 << 0) }
ALLOC  (0x0000600001340000): Layout { size: 256, align: 1 (1 << 0) }
DEALLOC(0x0000600000144000): Layout { size: 128, align: 1 (1 << 0) }
ALLOC  (0x0000000126704080): Layout { size: 512, align: 1 (1 << 0) }
DEALLOC(0x0000600001340000): Layout { size: 256, align: 1 (1 << 0) }
ALLOC  (0x0000000127008200): Layout { size: 1024, align: 1 (1 << 0) }
DEALLOC(0x0000000126704080): Layout { size: 512, align: 1 (1 << 0) }
ALLOC  (0x0000000127008600): Layout { size: 2048, align: 1 (1 << 0) }
DEALLOC(0x0000000127008200): Layout { size: 1024, align: 1 (1 << 0) }
ALLOC  (0x0000000127008E00): Layout { size: 4096, align: 1 (1 << 0) }
DEALLOC(0x0000000127008600): Layout { size: 2048, align: 1 (1 << 0) }
ALLOC  (0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
DEALLOC(0x0000000127008E00): Layout { size: 4096, align: 1 (1 << 0) }
ALLOC  (0x000000012700BE00): Layout { size: 16384, align: 1 (1 << 0) }
DEALLOC(0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
Serialized, deser as vec
ALLOC  (0x0000000127008200): Layout { size: 4096, align: 1 (1 << 0) }
ALLOC  (0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
DEALLOC(0x0000000127008200): Layout { size: 4096, align: 1 (1 << 0) }
ALLOC  (0x000000012700FE00): Layout { size: 16384, align: 1 (1 << 0) }
DEALLOC(0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
Deser, len: 9000, cap: 16384
Making source2...
ALLOC  (0x0000000127013E00): Layout { size: 9000, align: 1 (1 << 0) }
Made2, serializing2
ALLOC  (0x0000600002D48000): Layout { size: 8, align: 1 (1 << 0) }
ALLOC  (0x0000600002D48010): Layout { size: 16, align: 1 (1 << 0) }
DEALLOC(0x0000600002D48000): Layout { size: 8, align: 1 (1 << 0) }
ALLOC  (0x0000600002F44000): Layout { size: 32, align: 1 (1 << 0) }
DEALLOC(0x0000600002D48010): Layout { size: 16, align: 1 (1 << 0) }
ALLOC  (0x0000600003A44000): Layout { size: 64, align: 1 (1 << 0) }
DEALLOC(0x0000600002F44000): Layout { size: 32, align: 1 (1 << 0) }
ALLOC  (0x0000600000144000): Layout { size: 128, align: 1 (1 << 0) }
DEALLOC(0x0000600003A44000): Layout { size: 64, align: 1 (1 << 0) }
ALLOC  (0x0000600001340000): Layout { size: 256, align: 1 (1 << 0) }
DEALLOC(0x0000600000144000): Layout { size: 128, align: 1 (1 << 0) }
ALLOC  (0x0000000126704080): Layout { size: 512, align: 1 (1 << 0) }
DEALLOC(0x0000600001340000): Layout { size: 256, align: 1 (1 << 0) }
ALLOC  (0x0000000127008200): Layout { size: 1024, align: 1 (1 << 0) }
DEALLOC(0x0000000126704080): Layout { size: 512, align: 1 (1 << 0) }
ALLOC  (0x0000000127008600): Layout { size: 2048, align: 1 (1 << 0) }
DEALLOC(0x0000000127008200): Layout { size: 1024, align: 1 (1 << 0) }
ALLOC  (0x0000000127008E00): Layout { size: 4096, align: 1 (1 << 0) }
DEALLOC(0x0000000127008600): Layout { size: 2048, align: 1 (1 << 0) }
ALLOC  (0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
DEALLOC(0x0000000127008E00): Layout { size: 4096, align: 1 (1 << 0) }
ALLOC  (0x0000000127016200): Layout { size: 16384, align: 1 (1 << 0) }
DEALLOC(0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
Serialized2, deser2 as boxslice
ALLOC  (0x0000000127008200): Layout { size: 4096, align: 1 (1 << 0) }
ALLOC  (0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
DEALLOC(0x0000000127008200): Layout { size: 4096, align: 1 (1 << 0) }
ALLOC  (0x000000012701A200): Layout { size: 16384, align: 1 (1 << 0) }
DEALLOC(0x0000000127009E00): Layout { size: 8192, align: 1 (1 << 0) }
ALLOC  (0x000000012701E200): Layout { size: 9000, align: 1 (1 << 0) }
DEALLOC(0x000000012701A200): Layout { size: 16384, align: 1 (1 << 0) }
Deser2, len: 9000
Making source3...
ALLOC  (0x0000000127008200): Layout { size: 9000, align: 1 (1 << 0) }
Made3, serializing3
ALLOC  (0x0000600002D48000): Layout { size: 8, align: 1 (1 << 0) }
ALLOC  (0x0000600002D48010): Layout { size: 16, align: 1 (1 << 0) }
DEALLOC(0x0000600002D48000): Layout { size: 8, align: 1 (1 << 0) }
ALLOC  (0x0000600002F44000): Layout { size: 32, align: 1 (1 << 0) }
DEALLOC(0x0000600002D48010): Layout { size: 16, align: 1 (1 << 0) }
ALLOC  (0x0000600003A44000): Layout { size: 64, align: 1 (1 << 0) }
DEALLOC(0x0000600002F44000): Layout { size: 32, align: 1 (1 << 0) }
ALLOC  (0x0000600000144000): Layout { size: 128, align: 1 (1 << 0) }
DEALLOC(0x0000600003A44000): Layout { size: 64, align: 1 (1 << 0) }
ALLOC  (0x0000600001340000): Layout { size: 256, align: 1 (1 << 0) }
DEALLOC(0x0000600000144000): Layout { size: 128, align: 1 (1 << 0) }
ALLOC  (0x0000000126704080): Layout { size: 512, align: 1 (1 << 0) }
DEALLOC(0x0000600001340000): Layout { size: 256, align: 1 (1 << 0) }
ALLOC  (0x000000012700A600): Layout { size: 1024, align: 1 (1 << 0) }
DEALLOC(0x0000000126704080): Layout { size: 512, align: 1 (1 << 0) }
ALLOC  (0x000000012700AA00): Layout { size: 2048, align: 1 (1 << 0) }
DEALLOC(0x000000012700A600): Layout { size: 1024, align: 1 (1 << 0) }
ALLOC  (0x000000012701A200): Layout { size: 4096, align: 1 (1 << 0) }
DEALLOC(0x000000012700AA00): Layout { size: 2048, align: 1 (1 << 0) }
ALLOC  (0x000000012701B200): Layout { size: 8192, align: 1 (1 << 0) }
DEALLOC(0x000000012701A200): Layout { size: 4096, align: 1 (1 << 0) }
ALLOC  (0x0000000127020600): Layout { size: 16384, align: 1 (1 << 0) }
DEALLOC(0x000000012701B200): Layout { size: 8192, align: 1 (1 << 0) }
Serialized3, deser3 as boxslice
ALLOC  (0x0000000127024600): Layout { size: 9000, align: 1 (1 << 0) }
Deser3, len: 9000, cap: 9000

Thanks!

The text was updated successfully, but these errors were encountered:

jamesmunns · 2023-07-08T18:58:00Z

Also as a note, postcard does all of the following already:

it provides a size hint in SeqAccess
it implements deserialize_bytes and deserialize_byte_buf in an efficient manner
deserialize_seq also has length information

jamesmunns · 2023-07-08T19:06:04Z

Ah, I just had someone link this code to me, which seems the "cautious" check is kicking in.

I think this makes sense to avoid DoS issues, but in the case of deserialize_bytes, I already have the slice of data that will be used to populate the buffer, which means that the wire format can't "lie" about how much data there is (or deserialization would fail). I know this is probably a rare case, but if you have any ideas about how I could signal this level of trust from postcard, I would definitely be interested!

jamesmunns · 2023-07-08T19:43:42Z

Tagging #850, as it seems related. The "we are sure of the size" protection I mentioned above only applies to byte slices and strings (which are also byte slices), and since we don't have specialization for Vec<u8> compared to Vec<T>, I'm also not sure what can be done here, other than maybe some opt-in way of signalling to serde to not be cautious.

dtolnay · 2023-07-08T20:30:01Z

We could change cautious to take into account the size of the allocation, instead of the number of elements. Something like:

const MAX_PREALLOC_BYTES: usize = 1024 * 1024;

if mem::size_of::<T>() == 0 {
    0
} else {
    cmp::min(hint.unwrap_or(0), ceil(MAX_PREALLOC_BYTES / size_of::<T>()))
}

jamesmunns · 2023-07-08T22:46:25Z

I think this sounds reasonable, and would address my issue! I can't think of a better way to have serde know the right thing to do, despite sorta having all the answers it needs in the separate parts, without specialization, or some API changes.

jamesmunns · 2023-07-09T00:52:21Z

Thank you @dtolnay!

This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [serde](https://serde.rs) ([source](https://github.com/serde-rs/serde)) | dependencies | patch | `1.0.167` -> `1.0.171` | --- ### Release Notes <details> <summary>serde-rs/serde (serde)</summary> ### [`v1.0.171`](https://github.com/serde-rs/serde/releases/tag/v1.0.171) [Compare Source](serde-rs/serde@v1.0.170...v1.0.171) - Support `derive(Deserialize)` on unit structs that have const generics ([#2500](serde-rs/serde#2500), thanks [@Baptistemontan](https://github.com/Baptistemontan)) ### [`v1.0.170`](https://github.com/serde-rs/serde/releases/tag/v1.0.170) [Compare Source](serde-rs/serde@v1.0.169...v1.0.170) - Produce error message on suffixed string literals inside serde attributes ([#2242](serde-rs/serde#2242)) - Support single identifier as unbraced default value for const generic parameter ([#2449](serde-rs/serde#2449)) ### [`v1.0.169`](https://github.com/serde-rs/serde/releases/tag/v1.0.169) [Compare Source](serde-rs/serde@v1.0.168...v1.0.169) - Add Deserializer::deserialize_identifier support for adjacently tagged enums ([#2475](serde-rs/serde#2475), thanks [@Baptistemontan](https://github.com/Baptistemontan)) - Fix unused_braces lint in generated Deserialize impl that uses braced const generic expressions ([#2414](serde-rs/serde#2414)) ### [`v1.0.168`](https://github.com/serde-rs/serde/releases/tag/v1.0.168) [Compare Source](serde-rs/serde@v1.0.167...v1.0.168) - Allow `serde::de::IgnoredAny` to be the type for a `serde(flatten)` field ([#2436](serde-rs/serde#2436), thanks [@Mingun](https://github.com/Mingun)) - Allow larger preallocated capacity for smaller elements ([#2494](serde-rs/serde#2494)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).  Co-authored-by: cabr2-bot <cabr2.help@gmail.com> Co-authored-by: crapStone <crapstone01@gmail.com> Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1959 Reviewed-by: crapStone <crapstone01@gmail.com> Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org> Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>

jamesmunns changed the title ~~How can serde impl's take better advantage of length hints?~~ How can serde impls take better advantage of length hints? Jul 8, 2023

dtolnay mentioned this issue Jul 9, 2023

Allow larger preallocated capacity for smaller elements #2495

Merged

dtolnay closed this as completed in #2495 Jul 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How can `serde` impls take better advantage of length hints? #2494

How can `serde` impls take better advantage of length hints? #2494

jamesmunns commented Jul 8, 2023 •

edited

jamesmunns commented Jul 8, 2023

jamesmunns commented Jul 8, 2023

jamesmunns commented Jul 8, 2023

dtolnay commented Jul 8, 2023

jamesmunns commented Jul 8, 2023

jamesmunns commented Jul 9, 2023

How can serde impls take better advantage of length hints? #2494

How can serde impls take better advantage of length hints? #2494

Comments

jamesmunns commented Jul 8, 2023 • edited

Code (it's gross, sorry)

jamesmunns commented Jul 8, 2023

jamesmunns commented Jul 8, 2023

jamesmunns commented Jul 8, 2023

dtolnay commented Jul 8, 2023

jamesmunns commented Jul 8, 2023

jamesmunns commented Jul 9, 2023

How can `serde` impls take better advantage of length hints? #2494

How can `serde` impls take better advantage of length hints? #2494

jamesmunns commented Jul 8, 2023 •

edited