New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Copy Rack 2.2 query parser to retain compatibility. #47080
Conversation
1d9e3d4
to
abf7026
Compare
@tenderlove does this mean we don't need perform_deep_munge anymore in Rack 3? |
I think for Rails 8, we should aim to drop Rack 2.x - and maybe this helps align on a timeline for dropping features that only exist for Rack 2.x's sake. |
Rails doesn't follow semver as you are thinking, so it is totally fine to introduce breaking changes in 7.1. Of course, 7.1 should support Rack 2 was well, so what I was thinking is not executing |
I don't have a strong opinion on the matter, and if you want to make a follow-up PR (or add commits to this one) that does what you propose, it seems reasonable to me. |
old_perform_deep_munge = ActionDispatch::Request::Utils.perform_deep_munge | ||
ActionDispatch::Request::Utils.perform_deep_munge = false | ||
begin | ||
if Rack.release < "3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it make sense to add the opposite test when Rack.release >= 3
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion, but this is consistent with the other checks I added. I slightly prefer if Rack.release < "3"
rather than unless Rack.release >= "3"
. If you want to change this, I would suggest a follow up PR changing it in all the places consistently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't suggesting to use unless but to have an else clause with a rack 3 version of the test. Because here you are basically removing tests.
Anyway, Matthew's concerns would need to be addressed first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is any point repeating tests that already exist in Rack. What do you think?
This is a breaking change in our API caused by an upstream change in our dependency, which I think is an internal implementation detail of AD::Request. ISTM our default response to this test failing should be to fix our implementation to maintain compatibility. Clearly, we had our own tests that explicitly guaranteed this historical behaviour. We might consider other options, but I think that needs a more specific conversation than merely updating our tests to reflect the changed behaviour now observed. |
@matthewd do you recall specifically what downstream system requires this behaviour? Because nothing else (appears to) breaks with this change, just these tests. My suggestion would be, for people who want the current behaviour, to stay on Rack 2. People who want the standard, conforming behaviour can use Rack 3. I did research this in depth and attempted to adjust the standards, but there was push back: whatwg/url#732 The standard is described here: https://url.spec.whatwg.org/#urlencoded-parsing - AFAIK, most browsers follow this standard, including JavaScript interfaces for URL encoded parameters. So, the current Rails behaviour (sans this PR) is deviating from the specification used by (all?) other systems doing URL encoding. As another data point, grape decided to go all in on the standardised encoding and drop the old bespoke implementation/behaviour: whatwg/url#732 (comment) cc @tenderlove whom had a strong opinion on this. |
These tests demonstrate behaviour that we have exposed to applications. The downstream I'm concerned about is the numerous Rails applications that exist in the wild -- so many that some will depend on any established behaviour. This was deliberately implemented, and tested, to work in this way -- it's reasonable for people's applications to have depended on that, and IMO it's not reasonable for us to change it from under them without 1) a specific reason, and 2) a safe deprecation-based transition. That being said, I think the thread you linked supports my opinion that the current behaviour is, in fact, either 1) conforming, or 2) not interested in ever conforming: (emphasis added)
By that definition (which is a true re-statement of the spec as written), our integer conversion makes us permanently non-conforming, and we have no interest in changing that. Alternatively, we might reasonably claim that integer conversion is safe because while it's a non-string value, you still get the standard-consistent value from If we accept that it's reasonable to return an integer [which will |
I'm reticent to change the parameter parsing behavior in Rails. Rails has specific requirements of parameter parsing in order to make it work with with ActiveRecord. We definitely expect nulls to come in via parameters because null can be represented in JSON and we expect that controller actions don't need to change behavior whether the request was made with JSON or via QPs. 69ab91a#diff-c0f0426bb88e0814669f4262e038f0c9a99a72e1c6b4808ffb3c69aceda0065eR947 IMO we should fork the code from Rack and pull it inside Rails. I think it's really fine for Rack to have a WHATWG conforming (or whatever spec we want) QP parser, but in the context of a Rails application, Rails should define how to parse the query string. In Rails applications it's very common to do
💯 I'd be quite happy to change this behavior, but we must at the very least have an escape hatch to allow apps to use the old behavior (and I don't think "please downgrade rack" is the right escape hatch). (Also I'm really paranoid about changing the way parameters are parsed in Rails because we've had so many security issues surrounding parameters + ActiveRecord that I don't really want to touch what we have now) |
Everything that's been said here seems reasonable to me. However, there are also some challenges e.g. there is no way to specify NULL values to https://developer.mozilla.org/en-US/docs/Web/API/URLSearchParams and other similar interfaces because it's not in the spec, thus not supported (unfortunately). So, even if we are comfortable with Rails interpretation, it's not strictly compatible with other web technologies. If you try to specify non-supported values (e.g. I actually agree that the standard not being able to specify NULL is pretty irritating. This is why I opened an issue on the project which specifies what appears to be the defacto standard. But I don't expect to make any real progress - it would be up to browsers, but IMHO, over a long enough time (and I attached some photos to the discussion), "Real World Null Strings" show up and have real world consequences for HTML Forms / Web Applications. So, I'm actually personally in support of being able to represent nulls correctly. As it stands, the best thing we can do, to indicate that a field has a null value, is simply not to specify it. Anyway, that aside, we can solve this problem by pulling in the Rack 2 default query parser in Rails. Maybe someone can suggest the best way to integrate with this at the configuration level so users are aware of what they are getting and can deprecate it if not needed. However, for Rails 7.1 this behaviour makes sense. Regarding parsing parameters and extracting structure, I would personally explore having a type checking/coercion layer in the controller/business logic layer. This probably avoids a lot of the automatic coercion problems/security issues. and can probably be even a little more strict than the current interpretation of fields, e.g. integers, etc. |
Okay, this now uses the old query parser, and all the tests are passing without a version guard. |
9695190
to
0434cf5
Compare
Out of curiosity, is this something that will be maintained in the future, or will have to be deprecated and removed? Should it be a gem instead? |
While I prefer the behaviour that preserves null values, I think it would make sense to follow the standard, therefore, I think we should try to change the standard to support preserving null values. Personally, I think it's okay for this code to live in Rails, but merely including Rails can change the query parser behaviour now. I think this proposed PR is fine for Rails 7.1 (compatibility) but longer term (e.g. Rails 8) we should make it configurable and possibly not the default. |
@tenderlove is it possible to get your feedback on the updated PR? |
|
||
# Replace the Rack query parser with the non-standard Rack 2.2 implementation. | ||
# See <https://github.com/rails/rails/pull/47080> for more details. | ||
Rack::Utils.default_query_parser = ActionDispatch::QueryParser.make_default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying the class in here looks good, but this feels risky.
Instead of changing Rack's default, can we arrange to use our parser when we're doing the parsing (and/or most importantly, in the production of our params
value), but leave Rack::Request to do whatever Rack wants if e.g. a third-party middleware is calling it directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No that’s not possible because the input body can be consumed and cached by Rack::Request
. The parameters only get parsed once and it will become non-deterministic whether rack or rails parsers get to it first in your proposed change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, and we can't separately cache our parsed interpretation, because the input isn't rewindable.
By even calling it "default", then, it seems that Rack's API is implying a level of flexibility it can not support (any more?).
An effective global monkey-patch of Rack behaviour as a result of loading our library is not an acceptable outcome... at best that will create a per-project non-determinism that will confound any Rack-downstream library, making them behave differently when loaded in a Rails app.
I think my current best suggestion is that we install a middleware to make the input rewindable [only] if it has a relevant content-type -- after which we should be able to do our own parameter parsing independently of rack, caching separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_query_parser
is designed for exactly this scenario and there is no problem changing it, but as I said we should probably make this user configurable.
Rack has lots of globally controllable behaviour and that's just the nature of it.
Practically, this change is a strict superset of existing behaviour, so I don't think anything will be broken by it. Can you find any examples?
I think my current best suggestion is that we install a middleware to make the input rewindable [only] if it has a relevant content-type -- after which we should be able to do our own parameter parsing independently of rack, caching separately.
I don't think this is a practical solution, and carries a high level of complexity and technical debt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really like to find a solution to this. It'll be very confusing if parsing is different depending on what files are loaded. I can imagine a scenario where someone is debugging QP parsing in a Rails app (I have had to do this) and they do it via the Rack APIs, but it's not reflective of reality since this file will be loaded in the app but not necessarily your IRB session.
No that’s not possible because the input body can be consumed and cached by Rack::Request.
If this is the case, then no Rack::Request
object should be allowed to be allocated before the request object in Rails, no? Otherwise our request object in Rails would always see empty QPs on POST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the case, then no Rack::Request object should be allowed to be allocated before the request object in Rails, no?
How do you propose to prevent this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you propose to prevent this?
I don't think we can. But if users add a middleware that uses Rack::Request
, surely they'll notice their QPs are gone and not use that middleware?
I guess this raises a larger issues: We basically require anyone using a Rack::Request
object to cache any calculations as subsequent allocations won't work since the body isn't rewindable.
Since allocating a Rack::Request isn't idempotent (it reads the body, mutating the input stream), it seems like we should probably make the request object a singleton, and the type configurable (Rails wants to have it's own request object).
Side note: I am starting to feel great regret for changing the way QPs are parsed in Rack!
FYI if you rebase this PR against main then it'll run the tests against rack 2 and 3, so we can confirm that this PR fixes the failing |
0434cf5
to
6c72200
Compare
Thanks for your feedback, I've just rebased it. |
👍 it doesn't show in the CI status, but there's some red tests that seem related: https://buildkite.com/rails/rails/builds/93278#01860a17-b242-4dbc-a67f-cbc0e27a2f9b |
Ahh thanks, I'll take a look! |
6c72200
to
d92583e
Compare
@ghiculescu thanks for your report, I've fixed the issue. |
I've merged these changes into the main |
Given everything that's been said here, I still think we should consider the option where Rack 2 is the default for existing applications and Rack 3 is the default for new applications isn't a bad approach. There doesn't appear to be any perfect solution that meets everyone's requirements. |
The WhatWG URL spec standardises the behaviour of query strings without `=` symbols, e.g. `a&b` as having empty string values, not nil. Rack 3 follows this behaviour. Pull in a copy of the Rack 2.2 query parser to retain compatibility with existing handling of query string keys without `=` symbols.
85bea4b
to
2657702
Compare
According to rack/rack#2038, we will try a different approach to fix this. Therefore, I will close this PR. Thanks everyone for your feedback. |
We have essentially decided to go with Matthew's implementation, and Rack 3.0.5 was released with his proposed changes. |
The WhatWG URL spec standardises the behaviour of query strings without
=
symbols, e.g.a&b
as having empty string values, not nil. Rack 3 follows this behaviour.Pull in a copy of the Rack 2.2 query parser to retain compatibility with existing handling of query string keys without
=
symbols.