Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Also return remaining bytes in FromBytes conversion methods #1059

Merged
merged 1 commit into from Apr 23, 2024

Conversation

jswrenn
Copy link
Collaborator

@jswrenn jswrenn commented Mar 19, 2024

This change makes the slice and non-slice conversion methods consistent, and provides a lightweight way (i.e., without Ref) to access the leftover bytes after conversions. For instance, this:

trait FromBytes {
    fn ref_from_prefix(bytes: &[u8]) -> Option<&Self>
    where
        Self: KnownLayout + NoCell;
}

...becomes this, in this PR:

trait FromBytes {
    fn ref_from_prefix(bytes: &[u8]) -> Option<(&Self, &[u8])>
    where
        Self: KnownLayout + NoCell;
}

...where the second element of the tuple are the remaining bytes. Providing the remainder eases the pattern where a T is parsed from a buffer, and then the buffer is advanced to exclude the parsed T.

This PR tests out the 'feel' of tuple-returning conversion methods. Alternatively, we could provide a mirror set of _split methods.

Addresses #1051, but is the antithesis of #884. Inspired by #1051 (comment). It's syntactically cheap for users to discard remaining bytes at call sites if they wish, but the alternative — discarding the excess and requiring users to go through Ref — is syntactically and cognitively expensive.

@jswrenn jswrenn requested a review from joshlf March 19, 2024 22:34
@smalis-msft
Copy link

While I'm obviously in favor of doing this in general, I feel like modifying the existing methods would be too large of a breaking change to be worthwhile, which was why I suggested new methods.

@jswrenn
Copy link
Collaborator Author

jswrenn commented Mar 20, 2024

Hm, we'll need to balance the costs of subjecting current users to breaking changes, against the costs of subjecting current and future users to a more bloated API. Adding _split methods consistently would require adding eight additional methods to FromBytes (which has 14 methods in our latest 0.8 pre-release):

  mut_from
  mut_from_prefix
+ mut_from_prefix_split
  mut_from_suffix
+ mut_from_suffix_split
  mut_slice_from
  mut_slice_from_prefix
+ mut_slice_from_prefix_split
  mut_slice_from_suffix
+ mut_slice_from_suffix_split
  read_from
  read_from_prefix
+ read_from_prefix_split
  read_from_suffix
+ read_from_suffix_split
  ref_from
  ref_from_prefix
+ ref_from_prefix_split
  ref_from_suffix
+ ref_from_suffix_split
  slice_from_prefix
  slice_from_suffix

That's quite a hefty addition, and whatever course of action we take here will also likely be taken for our upcoming TryFromBytes trait, too.

Since we're preparing for a major breaking release anyways, we have a bit more license than usual to consider breaking changes.

@kupiakos
Copy link
Contributor

kupiakos commented Mar 30, 2024

As I discuss in #884 (comment), I'm strongly in support of not removing the plain-casting methods that don't return the suffix. Embedded code should do the minimum work necessary for the job. Zerocopy has a few design decisions that are good for networking code, but bad for embedded, like how all of the byteorder items are also unaligned. Removing these methods pushes it further in an embedded-hostile direction.

I personally find the zerocopy APIs lacking in useful transitive operations like this, so I would prefer the "bloat" of adding these methods and keeping the plain-casting ones.

What do you think of these methods starting with split_ instead of ending in _split? split_mut_from_prefix reads clearly in the imperative mood and highlights that it's doing more than just a bounds-check and pointer cast: it's splitting as well. This split is an extra operation, both cognitively and for the compiler.

I don't care about the *_from_suffix API and I don't know who does - we should be consistent where it makes sense to do so. Perhaps as a middle ground, this PR can introduce only splitting versions for the prefix APIs, which adds 5 methods instead of 10:

  mut_from
  mut_from_prefix
+ split_mut_from_prefix
  mut_from_suffix
  mut_slice_from
  mut_slice_from_prefix
+ split_mut_slice_from_prefix
  mut_slice_from_suffix
  read_from
  read_from_prefix
+ split_read_from_prefix
  read_from_suffix
  ref_from
  ref_from_prefix
+ split_ref_from_prefix
  ref_from_suffix
  slice_from_prefix
+ split_slice_from_prefix
  slice_from_suffix

@smalis-msft
Copy link

smalis-msft commented Apr 1, 2024

I don't care about the *_from_suffix API and I don't know who does

We use it for parsing certain things as well as prefix, and would strongly prefer to have split versions of it as well if you decide to go this route.

@jswrenn
Copy link
Collaborator Author

jswrenn commented Apr 5, 2024

I've made a discussion post exploring some of the naming and return type design space: #1095

@jswrenn jswrenn enabled auto-merge April 19, 2024 18:38
@joshlf
Copy link
Member

joshlf commented Apr 19, 2024

I'm inclined to merge this based on the following reasoning:

  • The ergonomics of discarding an unneeded prefix/suffix are better than the ergonomics of reconstructing a prefix/suffix if one isn't returned by the API
  • From what little visibility we have into use cases in the ecosystem, it seems that wanting the prefix/suffix is the more common situation
  • For users who care about not relying on the compiler to perform optimizations, techniques like the one described in this comment are available
  • More generally, we are aware that our current internal software organization is based on the assumption that the compiler will aggressively optimize, and we are actively considering how to provide better support for users who aren't comfortable with this situation: Support users who want guarantees regarding optimization #1125

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
This change makes the slice and non-slice conversion methods
consistent, and provides a lightweight way (i.e., without `Ref`)
to access the leftover bytes after conversions.
@jswrenn jswrenn force-pushed the length-lossless-trait-conversion-methods branch from bfa0caf to e0e9b0d Compare April 23, 2024 19:41
@jswrenn jswrenn added this pull request to the merge queue Apr 23, 2024
Merged via the queue into main with commit 4f86a22 Apr 23, 2024
210 checks passed
@jswrenn jswrenn deleted the length-lossless-trait-conversion-methods branch April 23, 2024 19:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants