Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Percent sign not encoded when present in query params value #2670

Closed
rslinckx opened this issue Apr 19, 2023 · 7 comments · Fixed by #2671
Closed

Percent sign not encoded when present in query params value #2670

rslinckx opened this issue Apr 19, 2023 · 7 comments · Fixed by #2671

Comments

@rslinckx
Copy link

Since httpx 0.24 the behavior regarding query params changed, and possibly contains a bug.

Use case: passing a pre-signed storage url to some webservice which then downloads the file.

The presigned url will contain some percent-escaped sequences.

With the new httpx 0.24, when something like client.get('http://webservice', params={"u": "http://example.com?q=foo%2Fa"}) is called, the query params are handed to httpx._urls.urlencode:

from httpx._urls import urlencode
urlencode((('u', "http://example.com?q=foo%2Fa"),))
# -> 'u=http%3A//example.com%3Fq%3Dfoo%2Fa'

The resulting query string still has %2F in it, which should have been percent encoded as %252F

Here is the same thing using python urllib.parse (which was used pre-0.24)

from urllib.parse import urlencode
urlencode((('u', "http://example.com?q=foo%2Fa"),))
# -> 'u=http%3A%2F%2Fexample.com%3Fq%3Dfoo%252Fa'

Here the resulting query string is correctly encoded.

Another way to look at it is as follows:

from urllib.parse importparse_qs
from httpx._urls import urlencode
parse_qs( urlencode((('u', "http://example.com?q=foo%2Fa%20b%2Ac"),)))
# -> {'u': ['http://example.com?q=foo/a b*c']}
# Incorrect url decoded from the query string

from urllib.parse import urlencode
parse_qs( urlencode((('u', "http://example.com?q=foo%2Fa%20b%2Ac"),)))
# -> {'u': ['http://example.com?q=foo%2Fa%20b%2Ac']}
# Correct original url decoded from the query string
@rslinckx
Copy link
Author

This is related to the changes here: #2543

@marban
Copy link

marban commented Apr 19, 2023

Also for forward-slashes #2661

@tomchristie
Copy link
Member

Okay, so let's frame the intended behaviour through public API, rather than private API...

def test_param_requires_encoding():
     url = httpx.URL('http://webservice', params={"u": "with spaces"})
     assert str(url) == 'http://webservice?u=with%20spaces'


 def test_param_does_not_require_encoding():
     url = httpx.URL('http://webservice', params={"u": "with%20spaces"})
     assert str(url) == 'http://webservice?u=with%20spaces'


 def test_param_with_existing_escape_requires_encoding():
     url = httpx.URL('http://webservice', params={"u": "http://example.com?q=foo%2Fa"})
     assert str(url) == 'http://webservice?u=http%3A//example.com%3Fq%3Dfoo%252Fa'

The final test here is the one which @rslinckx demonstrates as breaking.

This is because our existing behaviour will percent-escape only when required, and won't escape already-valid percent encoded sequences. However this clearly isn't desirable. What we actually want is percent-escape only when required, and if we do apply escaping, then we should encode any existing '%' characters, even if they're part of a valid escape sequence.

@dlange-hima
Copy link

If understand correctly, the new code still leaves query parameters which look like already percent-escapes as-is?
I think this is not a good idea, because you can't know if it was intentionally escaped or the data coincidentally looks like percent-escaped and really needs to be escaped to be reinterpretet as intentended by the webserver.
It would be nice to control the behavior..

@peku33
Copy link

peku33 commented May 4, 2023

+1 for @dlange-hima

I belive that urlparse(urlencode(x)) shoud be equal to x. This won't be true for parameter values like %20.

With current solution it's actually impossible to send %XY as query parameter key or value.

@marban
Copy link

marban commented May 9, 2023

Has anyone tried this with the Twitter API, i.e. forward slash issues? #2661

@marban
Copy link

marban commented May 21, 2023

So even if I handle the %2F for calling Twitter myself, it still won't work since now the % obviously gets encoded as well.
Also see #2661 (reply in thread)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants