Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When parsing a multi-part POST, retain original pairs #2088

Merged
merged 1 commit into from Jul 17, 2023

Conversation

matthewd
Copy link
Contributor

The input is not guaranteed to be rewindable, and the default parameter expansion loses details of the parameter names.

For form-encoded data, we (already) retain the full string: it contains only simple key/value pairs, and will be of a manageable size.

Multipart data may contain uploaded files, so we don't wish to retain the full input string -- that's why we dropped the rewindability requirement on the input stream. Instead, we retain an array of two-element arrays representing the "raw" key-value pairs as we parsed the stream. This minimizes additional memory use, because the values are the same objects we use in the params hash, while still allowing an interested consumer to apply their own logic to parameter name interpretation.


@jeremyevans @ioquatix I believe this is the minimal version of what we discussed at Kaigi. Apologies for the delay.

For a consumer to use it in this state, they'd need to call Rack::Request#POST, relying on its side-effect, then manually pull rack.request.form_pairs. Neither of those steps feels like a great API, so I think it would make sense to provide a body_param_list-like method as in #2038, but it might be easier to discuss that in a separate follow-up PR, if this seems independently reasonable as the direct implementation / extra retained values?

The input is not guaranteed to be rewindable, and the default parameter
expansion loses details of the parameter names.

For form-encoded data, we (already) retain the full string: it contains
only simple key/value pairs, and will be of a manageable size.

Multipart data may contain uploaded files, so we don't wish to retain
the full input string -- that's why we dropped the rewindability
requirement on the input stream. Instead, we retain an array of
two-element arrays representing the "raw" key-value pairs as we parsed
the stream. This minimizes additional memory use, because the values are
the same objects we use in the params hash, while still allowing an
interested consumer to apply their own logic to parameter name
interpretation.
Copy link
Contributor

@jeremyevans jeremyevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the discussion regarding this at RubyKaigi, it helped me better understand the issue.

This looks good to me. I have one inline comment, but it can be done after this is merged if you and @ioquatix agree.

lib/rack/request.rb Show resolved Hide resolved
@ioquatix
Copy link
Member

I discussed this with Matthew.

There is an expected follow up PR, creating a method, like form_pairs with the following possible implementation:

def form_pairs
  if form_data? || parseable_data?
    if pairs = Rack::Multipart.parse_multipart(env, Rack::Multipart::ParamList)
      set_header RACK_REQUEST_FORM_PAIRS, pairs
    end
  end

  get_header RACK_REQUEST_FORM_PAIRS # cached value.
end

And then in theory POST is form_pairs |> expand_param_pairs, + hash caching

@ioquatix ioquatix merged commit 3855d1d into main Jul 17, 2023
25 of 27 checks passed
@ioquatix ioquatix deleted the preserve-multipart-params branch July 17, 2023 06:23
JunichiIto pushed a commit to JunichiIto/rack that referenced this pull request Jul 17, 2023
The input is not guaranteed to be rewindable, and the default parameter
expansion loses details of the parameter names.

For form-encoded data, we (already) retain the full string: it contains
only simple key/value pairs, and will be of a manageable size.

Multipart data may contain uploaded files, so we don't wish to retain
the full input string -- that's why we dropped the rewindability
requirement on the input stream. Instead, we retain an array of
two-element arrays representing the "raw" key-value pairs as we parsed
the stream. This minimizes additional memory use, because the values are
the same objects we use in the params hash, while still allowing an
interested consumer to apply their own logic to parameter name
interpretation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants