Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix encoding consistency with sorted and non-sorted URL parameters #158

Merged
merged 3 commits into from Jan 18, 2022

Conversation

ludofischer
Copy link
Contributor

Fix #149

Currently normalize-url encodes URL parameters when sortQueryParameters is true and leaves them as they are when not. This behaviour is confusing as a user would not expect sorting the parameters to affect the encoding. This PR tries to ensure the search parameters are always decoded, since normalize-url already always decodes the pathname and does not intend to sanitize the URL;

urlObject.pathname = decodeURI(urlObject.pathname);

@ludofischer ludofischer marked this pull request as ready for review November 11, 2021 18:13
@sindresorhus
Copy link
Owner

This really sounds like a WHATWG URL bug. It should always be consistent about encoding whether or not .sort() is called. If you can only reproduce this in Node.js, I recommend opening an issue on https://github.com/nodejs/node, otherwise https://github.com/whatwg/url

index.js Outdated
@@ -208,6 +208,11 @@ export default function normalizeUrl(urlString, options) {
urlObject.searchParams.sort();
}

// Decode query parameters
try {
urlObject.search = decodeURIComponent(urlObject.search);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the search components are decoded when .sort() is not called, this will potentially decode encoded character where the encoding is meant to be preserved. You could potentially only decode when !options.sortQueryParameters, but I think the encoding inconsistency is a bug in the URL implementation and it would be risky to depend on it staying like that.

@ludofischer
Copy link
Contributor Author

This really sounds like a WHATWG URL bug. It should always be consistent about encoding whether or not .sort() is called. If you can only reproduce this in Node.js, I recommend opening an issue on https://github.com/nodejs/node, otherwise https://github.com/whatwg/url

I can reproduce it in Firefox and Chrome too! And it seems the issue is well known: whatwg/url#491:

And any attempts to manipulate it will change the contents of your URL's query string in unintended ways

@ludofischer
Copy link
Contributor Author

See also whatwg/url#478 where they decide to keep the current searchParams behaviour.
Given that searchParams modifies the input by design and the working group does not seem to want to change it, would it be preferable to avoid using it altogether?

@sindresorhus
Copy link
Owner

Given that searchParams modifies the input by design and the working group does not seem to want to change it, would it be preferable to avoid using it altogether?

There's no easy way to avoid it as we need to keep the sort support and I don't want to bring in another library just for that.

@sindresorhus
Copy link
Owner

You could potentially only decode when !options.sortQueryParameters

I guess this is the best option then.

Only decode when the parameters have been sorted.
@ludofischer
Copy link
Contributor Author

You could potentially only decode when !options.sortQueryParameters

I guess this is the best option then.

I've pushed an additional commit to check if the parameters have been sorted before decoding.

@sindresorhus sindresorhus changed the title fix: consistent output between sorted and non-sorted URL parameters Fix encoding consistency with sorted and non-sorted URL parameters Jan 18, 2022
@sindresorhus sindresorhus merged commit c42f53b into sindresorhus:main Jan 18, 2022
@ludofischer ludofischer deleted the fix-decode-params branch January 18, 2022 15:57
@psamusev
Copy link

psamusev commented Apr 20, 2022

The current behavior of
urlObject.pathname = decodeURI(urlObject.pathname);
doesn't work as expected. Since URL automatically always encode any value that we set, according to https://developer.mozilla.org/en-US/docs/Web/API/URL

URLs are encoded according to the rules found in [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986)

@sindresorhus
Copy link
Owner

@psamusev It's better to open a new issue instead of commenting on an old pull request. Submitting a failing test would be even better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sortQueryParameters encodes the query string
3 participants