Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify what should happen when a Reader method returns an error result #21

Open
briansmith opened this issue Jun 21, 2018 · 3 comments

Comments

@briansmith
Copy link
Owner

Definitely my expectation is that a user of Reader will stop using the reader after a Reader method fails. However, that expectation isn't documented anywhere.

In discussing PR #20 with @oherrala (offline), the point was made that, based on the documentation in the untrusted module, no Reader methods must ever panic even if they are called after a Reader method returned an error.

@oherrala Also suggested that we change things so that, once a Reader method returns an error, then all future calls to all other Reader methods would return an error. I brought up some concerns that this would add an extra check in each method that might hurt performance; we'd need to see how the optimizer copes with this. OTOHO @oherrala brought up the point that most untrusted users probably care much more about safety than absolute performance. Still, it would be useful to try to minimize the performance impact based on the "performance is an aspect of security" mindset. One way to do it would be to set the Reader to the end of the input, and then only check the "previous error" bit when we detect we're at the end of the stream. The more problematic aspect of this idea is that there are methods like mark(), peek(), and skip_to_end() that don't returns a Result. It doesn't make much sense to have this sticky error bit only affect a subset of the methods of the Reader.

Another alternative would be to have every Reader method consume the reader, i.e. take self instead of &mut self as a parameter. Then every method would return Result<(Reader, [current result type]), Error>. We'd need to do some experimenting (e.g. rewrite webpki) to validate such an API design.

@briansmith
Copy link
Owner Author

Reader::read_byte() attempts to return one byte and advance the Reader's internal cursor, returning either the byte read or an error indicating that there's no more input. There are some other methods that could be thought of as logically calling read_byte() multiple times and then returning either a bunch of bytes at once or an error indicating that the end of the input was reached. If we think about it that way, then any Reader method that reads, when it fails, could very well set the internal cursor to the end of the input, since that's what would happen if the user attempted to access those same bytes using multiple read_byte() calls.

Still that doesn't seem very satisfying for Reader::at_end(). Are we really at the end of the input because that's the end of a (potentially) valid input, or are we at the end of an input that we know to be too short?

@briansmith
Copy link
Owner Author

The best solution for this is probably to change methods that take a &mut self (where self: Reader) to instead take self and return self only on success. For example, a method of type &mut self -> Result<u8, EndOfInput> would change to have signature self -> Result<(self, u8), EndOfInput>.

Of course, that would be a highly disruptive change!

@briansmith
Copy link
Owner Author

I prototyped a refactoring of this library where every fallible Reader method was changed from fn(&mut self, ...) -> Result<T, EndOfInput> to fn(self, ...) -> Result<(Self, T), EndOfInput>. This refactoring makes it impossible to use a Reader after it has returned an error. However, I found it to be pretty annoying to use.

I think instead we should investigate changing the semantics to be "whenever a method fails, the Reader is left in the same state as it previously was in." If we do that, we should also consider replacing the use of Result<T, EndOfInput> with Option<T>.

Those changes seem straightforward as far as the implementation of untrusted is concerned. However, I worry that most things built on top of untrusted, will not safely implement those same semantics. More experimentation will be required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant