The `is_token` function, used exclusively for parsing the method in a request line, allows more values than it should. In particular, it allows a leading space to be parsed. This problem is not exposed in hyper, which revalidates any method extracted by httparse; otherwise I'm sure this would have been noticed sooner!

Checking for a single range of valid bytes is very fast, so I've taken care to make sure that making `is_token` more complicated doesn't slow down the most common case. While exploring a variety of options, I found the existing benchmark scheme to be a bit misleading because it would test only a single method at a time, so I've made a new benchmark that roughly simulates a mix of requests. Ultimately, what I found to be a reasonable fix without any slowdown for the 99.9999% case is to check `b'A'..=b'Z'` and then fall back to a "byte map".

Both methods and header names have the same set of allowed bytes, a "token", but their uses are slightly different. I thought it would make sense to rename `is_token` to `is_method_token`, to mimic `is_header_name_token`.
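
For reviewers who want to see the shape of the check, here is a minimal sketch of the "fast range first, byte map second" idea. It is not the code in this PR: `TOKEN_MAP` and `build_token_map` are hypothetical names, and the allowed set is taken from the `tchar` rule in RFC 9110, which I am assuming is the definition of "token" intended here.

```rust
/// Hypothetical helper: builds a 256-entry lookup table that is `true`
/// for every byte allowed in an HTTP token (`tchar` in RFC 9110).
const fn build_token_map() -> [bool; 256] {
    let mut map = [false; 256];
    let mut i = 0usize;
    while i < 256 {
        let b = i as u8;
        map[i] = matches!(
            b,
            b'!' | b'#' | b'$' | b'%' | b'&' | b'\'' | b'*' | b'+' | b'-' | b'.'
                | b'^' | b'_' | b'`' | b'|' | b'~'
                | b'0'..=b'9'
                | b'A'..=b'Z'
                | b'a'..=b'z'
        );
        i += 1;
    }
    map
}

/// Hypothetical name for the "byte map" fallback table.
static TOKEN_MAP: [bool; 256] = build_token_map();

/// Sketch of the check: uppercase ASCII letters (which cover the common
/// methods such as GET and POST) are accepted by a single range test;
/// every other byte is looked up in the table.
#[inline]
fn is_method_token(b: u8) -> bool {
    match b {
        b'A'..=b'Z' => true,
        _ => TOKEN_MAP[b as usize],
    }
}
```

With this shape, a space (`b' '`) is rejected because it falls through to the table, while a method made only of uppercase letters never touches the table at all.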