Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling of multipart parsing when request is missing body. #1997

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
10 changes: 10 additions & 0 deletions CHANGELOG.md
Expand Up @@ -2,6 +2,16 @@

All notable changes to this project will be documented in this file. For info on how to format all future additions to this file please reference [Keep A Changelog](https://keepachangelog.com/en/1.0.0/).

## [Unreleased]

### SPEC Changes

- `rack.input` is now optional. ([#1997](https://github.com/rack/rack/pull/1997), [@ioquatix])

### Changed

- Improved handling of multipart requests. `rack.input` is now optional, and if missing, will raise an error which includes `module Rack::BadRequest`. Several other exceptions also include this module. ([#1997](https://github.com/rack/rack/pull/1997), [@ioquatix])

## [3.0.3] - 2022-12-07

### Fixed
Expand Down
2 changes: 1 addition & 1 deletion SPEC.rdoc
Expand Up @@ -121,7 +121,7 @@ If the string values for CGI keys contain non-ASCII characters,
they should use ASCII-8BIT encoding.
There are the following restrictions:
* <tt>rack.url_scheme</tt> must either be +http+ or +https+.
* There must be a valid input stream in <tt>rack.input</tt>.
* There may be a valid input stream in <tt>rack.input</tt>.
* There must be a valid error stream in <tt>rack.errors</tt>.
* There may be a valid hijack callback in <tt>rack.hijack</tt>
* The <tt>REQUEST_METHOD</tt> must be a valid token.
Expand Down
1 change: 1 addition & 0 deletions lib/rack.rb
Expand Up @@ -15,6 +15,7 @@
require_relative 'rack/constants'

module Rack
autoload :BadRequest, "rack/bad_request"
autoload :Builder, "rack/builder"
autoload :BodyProxy, "rack/body_proxy"
autoload :Cascade, "rack/cascade"
Expand Down
4 changes: 4 additions & 0 deletions lib/rack/bad_request.rb
@@ -0,0 +1,4 @@
module Rack
module BadRequest
ioquatix marked this conversation as resolved.
Show resolved Hide resolved
end
end
26 changes: 14 additions & 12 deletions lib/rack/lint.rb
Expand Up @@ -56,9 +56,6 @@ def response
raise LintError, "No env given" unless @env
check_environment(@env)

@env[RACK_INPUT] = InputWrapper.new(@env[RACK_INPUT])
@env[RACK_ERRORS] = ErrorWrapper.new(@env[RACK_ERRORS])

## and returns a non-frozen Array of exactly three values:
@response = @app.call(@env)
raise LintError, "response is not an Array, but #{@response.class}" unless @response.kind_of? Array
Expand Down Expand Up @@ -265,11 +262,9 @@ def check_environment(env)
## is reserved for use with the Rack core distribution and other
## accepted specifications and must not be used otherwise.
##

%w[REQUEST_METHOD SERVER_NAME QUERY_STRING SERVER_PROTOCOL
rack.input rack.errors].each { |header|
%w[REQUEST_METHOD SERVER_NAME QUERY_STRING SERVER_PROTOCOL rack.errors].each do |header|
raise LintError, "env missing required key #{header}" unless env.include? header
}
end

## The <tt>SERVER_PORT</tt> must be an Integer if set.
server_port = env["SERVER_PORT"]
Expand Down Expand Up @@ -328,10 +323,17 @@ def check_environment(env)
raise LintError, "rack.url_scheme unknown: #{env[RACK_URL_SCHEME].inspect}"
end

## * There must be a valid input stream in <tt>rack.input</tt>.
check_input env[RACK_INPUT]
## * There may be a valid input stream in <tt>rack.input</tt>.
if rack_input = env[RACK_INPUT]
check_input_stream(rack_input)
@env[RACK_INPUT] = InputWrapper.new(rack_input)
end

## * There must be a valid error stream in <tt>rack.errors</tt>.
check_error env[RACK_ERRORS]
rack_errors = env[RACK_ERRORS]
check_error_stream(rack_errors)
@env[RACK_ERRORS] = ErrorWrapper.new(rack_errors)

## * There may be a valid hijack callback in <tt>rack.hijack</tt>
check_hijack env

Expand Down Expand Up @@ -384,7 +386,7 @@ def check_environment(env)
##
## The input stream is an IO-like object which contains the raw HTTP
## POST data.
def check_input(input)
def check_input_stream(input)
## When applicable, its external encoding must be "ASCII-8BIT" and it
## must be opened in binary mode, for Ruby 1.9 compatibility.
if input.respond_to?(:external_encoding) && input.external_encoding != Encoding::ASCII_8BIT
Expand Down Expand Up @@ -488,7 +490,7 @@ def close(*args)
##
## === The Error Stream
##
def check_error(error)
def check_error_stream(error)
## The error stream must respond to +puts+, +write+ and +flush+.
[:puts, :write, :flush].each { |method|
unless error.respond_to? method
Expand Down
18 changes: 7 additions & 11 deletions lib/rack/mock_request.rb
Expand Up @@ -41,11 +41,6 @@ def string
end
end

DEFAULT_ENV = {
RACK_INPUT => StringIO.new,
RACK_ERRORS => StringIO.new,
}.freeze

def initialize(app)
@app = app
end
Expand Down Expand Up @@ -104,7 +99,7 @@ def self.env_for(uri = "", opts = {})
uri = parse_uri_rfc2396(uri)
uri.path = "/#{uri.path}" unless uri.path[0] == ?/

env = DEFAULT_ENV.dup
env = {}

env[REQUEST_METHOD] = (opts[:method] ? opts[:method].to_s.upcase : GET).b
env[SERVER_NAME] = (uri.host || "example.org").b
Expand Down Expand Up @@ -144,20 +139,21 @@ def self.env_for(uri = "", opts = {})
end
end

opts[:input] ||= String.new
if String === opts[:input]
rack_input = StringIO.new(opts[:input])
else
rack_input = opts[:input]
end

rack_input.set_encoding(Encoding::BINARY)
env[RACK_INPUT] = rack_input
if rack_input
rack_input.set_encoding(Encoding::BINARY)
env[RACK_INPUT] = rack_input

env["CONTENT_LENGTH"] ||= env[RACK_INPUT].size.to_s if env[RACK_INPUT].respond_to?(:size)
env["CONTENT_LENGTH"] ||= env[RACK_INPUT].size.to_s if env[RACK_INPUT].respond_to?(:size)
end

opts.each { |field, value|
env[field] = value if String === field
env[field] = value if String === field
}

env
Expand Down
10 changes: 9 additions & 1 deletion lib/rack/multipart.rb
Expand Up @@ -6,16 +6,24 @@
require_relative 'multipart/parser'
require_relative 'multipart/generator'

require_relative 'bad_request'

module Rack
# A multipart form data parser, adapted from IOWA.
#
# Usually, Rack::Request#POST takes care of calling this.
module Multipart
MULTIPART_BOUNDARY = "AaB03x"

class MissingInputError < StandardError
include BadRequest
end

class << self
def parse_multipart(env, params = Rack::Utils.default_query_parser)
io = env[RACK_INPUT]
unless io = env[RACK_INPUT]
raise MissingInputError, "Missing input stream!"
end

if content_length = env['CONTENT_LENGTH']
content_length = content_length.to_i
Expand Down
18 changes: 14 additions & 4 deletions lib/rack/multipart/parser.rb
Expand Up @@ -3,18 +3,28 @@
require 'strscan'

require_relative '../utils'
require_relative '../bad_request'

module Rack
module Multipart
class MultipartPartLimitError < Errno::EMFILE; end
class MultipartPartLimitError < Errno::EMFILE
include BadRequest
end

# Use specific error class when parsing multipart request
# that ends early.
class EmptyContentError < ::EOFError; end
class EmptyContentError < ::EOFError
include BadRequest
end

# Base class for multipart exceptions that do not subclass from
# other exception classes for backwards compatibility.
class Error < StandardError; end
class BoundaryTooLongError < StandardError
include BadRequest
end

Error = BoundaryTooLongError
deprecate_constant :Error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deprecation at this point is too aggressive. Otherwise, how do you write code that rescues this error and works in both Rack 3.0 and 3.1 (and ideally, Rack 2)?

Copy link
Member Author

@ioquatix ioquatix Jan 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error AFAIK doesn't exist in Rack 2.x?

It was introduced here: d866dc4

Deprecations only show up with warnings enabled. We can keep it for one more minor version if you think that makes sense. How would you like to deprecate this?

I think going forward, people should probably rescue Rack::BadRequest rather than specific multipart/query_parser errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep it until Rack 4.

Copy link
Member Author

@ioquatix ioquatix Jan 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question was when should we deprecate it? I believe it can be removed in Rack 4, are you suggesting we should deprecate it it in Rack 4?


EOL = "\r\n"
MULTIPART = %r|\Amultipart/.*boundary=\"?([^\";,]+)\"?|ni
Expand Down Expand Up @@ -96,7 +106,7 @@ def self.parse(io, content_length, content_type, tmpfile, bufsize, qp)
if boundary.length > 70
# RFC 1521 Section 7.2.1 imposes a 70 character maximum for the boundary.
# Most clients use no more than 55 characters.
raise Error, "multipart boundary size too large (#{boundary.length} characters)"
raise BoundaryTooLongError, "multipart boundary size too large (#{boundary.length} characters)"
end

io = BoundedIO.new(io, content_length) if content_length
Expand Down
14 changes: 11 additions & 3 deletions lib/rack/query_parser.rb
@@ -1,22 +1,30 @@
# frozen_string_literal: true

require_relative 'bad_request'

module Rack
class QueryParser
DEFAULT_SEP = /[&] */n
COMMON_SEP = { ";" => /[;] */n, ";," => /[;,] */n, "&" => /[&] */n }

# ParameterTypeError is the error that is raised when incoming structural
# parameters (parsed by parse_nested_query) contain conflicting types.
class ParameterTypeError < TypeError; end
class ParameterTypeError < TypeError
include BadRequest
end

# InvalidParameterError is the error that is raised when incoming structural
# parameters (parsed by parse_nested_query) contain invalid format or byte
# sequence.
class InvalidParameterError < ArgumentError; end
class InvalidParameterError < ArgumentError
include BadRequest
end

# ParamsTooDeepError is the error that is raised when params are recursively
# nested over the specified limit.
class ParamsTooDeepError < RangeError; end
class ParamsTooDeepError < RangeError
include BadRequest
end

def self.make_default(param_depth_limit)
new Params, param_depth_limit
Expand Down
24 changes: 18 additions & 6 deletions lib/rack/request.rb
Expand Up @@ -469,9 +469,10 @@ def content_charset
# content-type header is provided and the request_method is POST.
def form_data?
type = media_type
meth = get_header(RACK_METHODOVERRIDE_ORIGINAL_METHOD) || get_header(REQUEST_METHOD)

(meth == POST && type.nil?) || FORM_DATA_MEDIA_TYPES.include?(type)
request_method = get_header(RACK_METHODOVERRIDE_ORIGINAL_METHOD) || get_header(REQUEST_METHOD)

(request_method == POST && type.nil?) || FORM_DATA_MEDIA_TYPES.include?(type)
end

# Determine whether the request body contains data by checking
Expand Down Expand Up @@ -501,10 +502,20 @@ def POST
end

begin
if get_header(RACK_INPUT).nil?
raise "Missing rack.input"
elsif get_header(RACK_REQUEST_FORM_INPUT) == get_header(RACK_INPUT)
get_header(RACK_REQUEST_FORM_HASH)
rack_input = get_header(RACK_INPUT)

# If the form hash was already memoized:
if form_hash = get_header(RACK_REQUEST_FORM_HASH)
# And it was memoized from the same input:
if get_header(RACK_REQUEST_FORM_INPUT).equal?(rack_input)
return form_hash
end
end

# Otherwise, figure out how to parse the input:
if rack_input.nil?
set_header RACK_REQUEST_FORM_INPUT, nil
set_header(RACK_REQUEST_FORM_HASH, {})
elsif form_data? || parseable_data?
unless set_header(RACK_REQUEST_FORM_HASH, parse_multipart)
form_vars = get_header(RACK_INPUT).read
Expand All @@ -516,6 +527,7 @@ def POST
set_header RACK_REQUEST_FORM_VARS, form_vars
set_header RACK_REQUEST_FORM_HASH, parse_query(form_vars, '&')
end

set_header RACK_REQUEST_FORM_INPUT, get_header(RACK_INPUT)
get_header RACK_REQUEST_FORM_HASH
else
Expand Down
8 changes: 6 additions & 2 deletions test/spec_lint.rb
Expand Up @@ -13,8 +13,12 @@
[200, { "content-type" => "test/plain", "content-length" => "3" }, ["foo"]]
end

def env(*args)
Rack::MockRequest.env_for("/", *args)
def env(options = {})
unless options.key?(:input)
options[:input] = String.new
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a smell. If we are changing rack.input to not be required, we shouldn't have the tests default to using it with an empty string.

Copy link
Member Author

@ioquatix ioquatix Jan 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lot's of tests fail without it because they assume that it's an empty string. We can rewrite those tests. However if we change that, it will make backporting (semi-unrelated changes) harder if the tests overlap since they will have diverged. This is about changing as few tests as possible - we can always rewrite the tests later once backporting is no longer expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should give you a rough indication of how much code this change will break. If we aren't willing to modify rack's own tests to fully implement the change, it's hard to argue the benefits of the change are worth the cost.

Copy link
Member Author

@ioquatix ioquatix Jan 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we aren't willing to modify rack's own tests to fully implement the change

Well, we did modify rack's own tests, and it was just in one place since the test fixture uses this and these tests assume an empty string. That shows minimal changes were required in this specific case.

end

Rack::MockRequest.env_for("/", options)
end

it "pass valid request" do
Expand Down
13 changes: 8 additions & 5 deletions test/spec_mock_request.rb
Expand Up @@ -14,7 +14,10 @@
app = Rack::Lint.new(lambda { |env|
req = Rack::Request.new(env)

env["mock.postdata"] = env["rack.input"].read
if input = env["rack.input"]
env["mock.postdata"] = input.read
end

if req.GET["error"]
env["rack.errors"].puts req.GET["error"]
env["rack.errors"].flush
Expand Down Expand Up @@ -46,7 +49,7 @@
end

it "should handle a non-GET request with both :input and :params" do
env = Rack::MockRequest.env_for("/", method: :post, input: nil, params: {})
env = Rack::MockRequest.env_for("/", method: :post, input: "", params: {})
env["PATH_INFO"].must_equal "/"
env.must_be_kind_of Hash
env['rack.input'].read.must_equal ''
ioquatix marked this conversation as resolved.
Show resolved Hide resolved
Expand All @@ -71,7 +74,7 @@
env["PATH_INFO"].must_equal "/"
env["SCRIPT_NAME"].must_equal ""
env["rack.url_scheme"].must_equal "http"
env["mock.postdata"].must_be :empty?
env["mock.postdata"].must_be_nil
end

it "allow GET/POST/PUT/DELETE/HEAD" do
Expand Down Expand Up @@ -194,7 +197,7 @@
env["QUERY_STRING"].must_include "baz=2"
env["QUERY_STRING"].must_include "foo%5Bbar%5D=1"
env["PATH_INFO"].must_equal "/foo"
env["mock.postdata"].must_equal ""
env["mock.postdata"].must_be_nil
end

it "accept raw input in params for GET requests" do
Expand All @@ -204,7 +207,7 @@
env["QUERY_STRING"].must_include "baz=2"
env["QUERY_STRING"].must_include "foo%5Bbar%5D=1"
env["PATH_INFO"].must_equal "/foo"
env["mock.postdata"].must_equal ""
env["mock.postdata"].must_be_nil
end

it "accept params and build url encoded params for POST requests" do
Expand Down
5 changes: 4 additions & 1 deletion test/spec_mock_response.rb
Expand Up @@ -14,7 +14,10 @@
app = Rack::Lint.new(lambda { |env|
req = Rack::Request.new(env)

env["mock.postdata"] = env["rack.input"].read
if input = env["rack.input"]
env["mock.postdata"] = input.read
end

if req.GET["error"]
env["rack.errors"].puts req.GET["error"]
env["rack.errors"].flush
Expand Down
12 changes: 9 additions & 3 deletions test/spec_multipart.rb
Expand Up @@ -30,16 +30,22 @@ def multipart_file(name)
end

it "return nil if content type is not multipart" do
env = Rack::MockRequest.env_for("/",
"CONTENT_TYPE" => 'application/x-www-form-urlencoded')
env = Rack::MockRequest.env_for("/", "CONTENT_TYPE" => 'application/x-www-form-urlencoded', :input => "")
Rack::Multipart.parse_multipart(env).must_be_nil
end

it "raises exception if boundary is too long" do
env = Rack::MockRequest.env_for("/", multipart_fixture(:content_type_and_no_filename, "A"*71))
lambda {
Rack::Multipart.parse_multipart(env)
}.must_raise Rack::Multipart::Error
}.must_raise Rack::Multipart::BoundaryTooLongError
end

it "raises a bad request exception if no body is given but content type indicates a multipart body" do
env = Rack::MockRequest.env_for("/", "CONTENT_TYPE" => 'multipart/form-data; boundary=BurgerBurger', :input => nil)
lambda {
Rack::Multipart.parse_multipart(env)
}.must_raise Rack::BadRequest
end

it "parse multipart content when content type present but disposition is not" do
Expand Down