How can serde impls take better advantage of length hints? #2494
Also as a note, postcard does all of the following already: |
Ah, I just had someone link this code to me, and it seems the "cautious" check is what is kicking in. I think this makes sense to avoid DoS issues, but in the case of `deserialize_bytes`, I already have the slice of data that will be used to populate the buffer, which means that the wire format can't "lie" about how much data there is (or deserialization would fail). I know this is probably a rare case, but if you have any ideas about how I could signal this level of trust from postcard, I would definitely be interested!
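To illustrate why a length prefix can't "lie" once the payload slice is already in hand, here is a minimal sketch. It is not postcard's actual code: `read_varint` and `read_byte_buf` are hypothetical helpers, and the wire layout (a LEB128-style varint length followed by the raw bytes) is a simplifying assumption. The point is that the claimed length is checked against the remaining input before anything is allocated, so the allocation is exact and bounded by data that really exists.

```rust
/// Hypothetical helper: decodes a LEB128-style varint length prefix.
/// Returns the decoded value and the remaining input, or None if the
/// input ends early or the value would not fit in a usize.
fn read_varint(input: &[u8]) -> Option<(usize, &[u8])> {
    let mut value: usize = 0;
    for (i, &byte) in input.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= usize::BITS {
            return None;
        }
        value |= ((byte & 0x7f) as usize) << shift;
        if byte & 0x80 == 0 {
            return Some((value, &input[i + 1..]));
        }
    }
    None
}

/// Hypothetical helper: reads a length-prefixed byte buffer.
/// The length is validated against the bytes actually present, so a
/// malicious prefix causes an error instead of a large allocation.
fn read_byte_buf(input: &[u8]) -> Option<(Vec<u8>, &[u8])> {
    let (len, rest) = read_varint(input)?;
    if len > rest.len() {
        return None; // the prefix claims more data than exists
    }
    let (payload, tail) = rest.split_at(len);
    // `to_vec` allocates once, sized to the payload.
    Some((payload.to_vec(), tail))
}
```

For example, `read_byte_buf(&[3, 1, 2, 3, 9])` yields the three payload bytes plus the trailing `9`, while a prefix claiming 1000 bytes with no payload behind it is rejected without allocating.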
Tagging #850, as it seems related. The "we are sure of the size" protection I mentioned above only applies to byte slices and strings (which are also byte slices), and since we don't have specialization for |
We could change

    const MAX_PREALLOC_BYTES: usize = 1024 * 1024;
    if mem::size_of::<T>() == 0 {
        0
    } else {
        cmp::min(hint.unwrap_or(0), ceil(MAX_PREALLOC_BYTES / size_of::<T>()))
    }
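A compilable sketch of the proposal above might look like the following. The function name `cautious_capacity` is an assumption for illustration, and the `ceil(...)` from the comment is replaced with plain integer division (which rounds down), so this is one reading of the idea rather than serde's exact code.

```rust
use std::{cmp, mem};

/// Sketch of a byte-budgeted preallocation cap (hypothetical name).
/// The hint is trusted only up to a fixed number of bytes, so small
/// elements get a larger element count than big ones.
fn cautious_capacity<T>(hint: Option<usize>) -> usize {
    const MAX_PREALLOC_BYTES: usize = 1024 * 1024;

    if mem::size_of::<T>() == 0 {
        // Zero-sized types need no storage, so reserve nothing.
        0
    } else {
        cmp::min(
            hint.unwrap_or(0),
            // Assumption: integer division instead of the `ceil` above.
            MAX_PREALLOC_BYTES / mem::size_of::<T>(),
        )
    }
}
```

With this shape, a hint of 9000 for `u8` elements is honored in full, while an absurdly large hint is clamped to 1 MiB worth of elements.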
I think this sounds reasonable, and would address my issue! I can't think of a better way to have serde know the right thing to do, despite sorta having all the answers it needs in the separate parts, without specialization, or some API changes. |
Thank you @dtolnay! |
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [serde](https://serde.rs) ([source](https://github.com/serde-rs/serde)) | dependencies | patch | `1.0.167` -> `1.0.171` |

---

### Release Notes

serde-rs/serde (serde)

### [`v1.0.171`](https://github.com/serde-rs/serde/releases/tag/v1.0.171)

[Compare Source](serde-rs/serde@v1.0.170...v1.0.171)

- Support `derive(Deserialize)` on unit structs that have const generics ([#2500](serde-rs/serde#2500), thanks [@Baptistemontan](https://github.com/Baptistemontan))

### [`v1.0.170`](https://github.com/serde-rs/serde/releases/tag/v1.0.170)

[Compare Source](serde-rs/serde@v1.0.169...v1.0.170)

- Produce error message on suffixed string literals inside serde attributes ([#2242](serde-rs/serde#2242))
- Support single identifier as unbraced default value for const generic parameter ([#2449](serde-rs/serde#2449))

### [`v1.0.169`](https://github.com/serde-rs/serde/releases/tag/v1.0.169)

[Compare Source](serde-rs/serde@v1.0.168...v1.0.169)

- Add Deserializer::deserialize_identifier support for adjacently tagged enums ([#2475](serde-rs/serde#2475), thanks [@Baptistemontan](https://github.com/Baptistemontan))
- Fix unused_braces lint in generated Deserialize impl that uses braced const generic expressions ([#2414](serde-rs/serde#2414))

### [`v1.0.168`](https://github.com/serde-rs/serde/releases/tag/v1.0.168)

[Compare Source](serde-rs/serde@v1.0.167...v1.0.168)

- Allow `serde::de::IgnoredAny` to be the type for a `serde(flatten)` field ([#2436](serde-rs/serde#2436), thanks [@Mingun](https://github.com/Mingun))
- Allow larger preallocated capacity for smaller elements ([#2494](serde-rs/serde#2494))

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Co-authored-by: cabr2-bot <cabr2.help@gmail.com>
Co-authored-by: crapStone <crapstone01@gmail.com>
Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1959
Reviewed-by: crapStone <crapstone01@gmail.com>
Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Hi there, author of postcard here. I had a user report an issue where deserializing a `Vec<u8>` significantly overallocated. They were deserializing a `Vec<u8>` of length 9000, and were panicking because they failed to allocate `16KiB` (this is on an embedded target with std but limited memory).

This is sort of explainable with Vec's growth strategy, which is currently power-of-two-ish, but seemed disappointing, as postcard as a format will always know the length of byte arrays ahead of time.
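The growth behavior described above can be observed with a small standalone sketch. It assumes that the length hint was being clamped to 4096 elements before the buffer was filled (an assumption about serde's behavior at the time, not something stated in this report), and `capacity_after_pushing` is a hypothetical helper. The exact capacities are an implementation detail of the standard library's `Vec`, so no specific numbers are guaranteed.

```rust
/// Hypothetical helper: fills a Vec<u8> one byte at a time, starting
/// from `initial_capacity`, and reports the capacity it ends up with.
fn capacity_after_pushing(initial_capacity: usize, len: usize) -> usize {
    let mut buf: Vec<u8> = Vec::with_capacity(initial_capacity);
    for i in 0..len {
        // Each push past the current capacity triggers a reallocation;
        // Vec grows geometrically rather than by one element.
        buf.push(i as u8);
    }
    buf.capacity()
}
```

Comparing `capacity_after_pushing(4096, 9000)` with `capacity_after_pushing(9000, 9000)` shows the difference: the first has to regrow along the way and typically ends up with more capacity than needed, while the second allocates once up front.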
We also tried switching to `Box<[u8]>`, but it looked like `Vec<u8>` was used as an intermediary, so it did not help.

I was able to reproduce this on desktop, and I was able to provide a workaround, roughly:
Which I observed allocated exactly the right amount only once.
Is there any way I can make this always the case from postcard's perspective, at least for `[u8]` (and potentially `str`)?

Full reproduction code and output below:
Code (it's gross, sorry)
And recorded output:
Thanks!