Skip to content

[12.x] Validate paths for UTF-8 characters #54370

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 28, 2025

Conversation

Jubeki
Copy link
Contributor

@Jubeki Jubeki commented Jan 27, 2025

TLDR: Currently if there invalid UTF-8 characters in the URL, the framework will fail to call some wildcard and fallback routes. This PR fixes this issue by adding a check for invalid UTF-8 characters in the URL. This PR also adds a test for this issue.

The Problem

Currently if a Person visits a URL like http://localhost:8000/%C0, the framework will fail to call wildcards and fallback routes. This is because the URL-Path contains the invalid UTF-8 characters %C0. This is an encoded character and would be represented as the following binary number: 1100 0000. This is not a valid UTF-8 character, as the first bit of a UTF-8 character must be 0 for a single byte character, 110 for a two-byte character, 1110 for a three-byte character, and 1111 0 for a four-byte character. That means that the URL-Path would expect a two-byte character, but the character is only one byte.

The URIValidator uses preg_match which can return int|false (0 - not matching, 1 - matching, false - error).
So error and not matching, both have falsy values, which means it will never call the wildcard or fallback routes.

The issue with that is for example that statamic, or any other application which registeres one of the following routes, these routes are never called and not even the fallback route is called.

Route::get('/{path}', fn() => 'Test Wildcard')->where('path', '.*');
Route::fallback(fn() => 'Test Fallback');

Some applications use the fallback route, to handle the 404 page with more information, which would then not be available in the 404 page. As these variables are not available the rendering of the 404 page results in an error, resulting in a 500 error.

The solutions

One solution is this PR, which validates the URL-Path and checks if there are invalid UTF-8 characters in the URL. If there are invalid UTF-8 characters in the URL, the framework will abort with 400 malformed url.

One other solution would be a whole rewrite of the fallback handling, which would be a bigger change and would require more time to implement. But then the fallback route would always be called, even if there are invalid UTF-8 characters in the URL.

Why the master / 12.x branch

As there are many applications in the wild and I can not be sure, that this would not break some applications, therefor I opted for the master branch.

Related issues / discussions:

Credits

Thanks to @crynobone for discussing this issue.

Also mentioning @duncanmcclean for the related statamic issue.

Copy link

Thanks for submitting a PR!

Note that draft PR's are not reviewed. If you would like a review, please mark your pull request as ready for review in the GitHub user interface.

Pull requests that are abandoned in draft may be closed due to inactivity.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@Jubeki Jubeki changed the title Add validate utf-8 path middleware [12.x] Validate paths for valid UTF-8 characters Jan 27, 2025
@Jubeki Jubeki changed the title [12.x] Validate paths for valid UTF-8 characters [12.x] Validate paths for UTF-8 characters Jan 27, 2025
@Jubeki Jubeki marked this pull request as ready for review January 27, 2025 12:46
*/
public function __construct()
{
parent::__construct(400, 'Malformed URL.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you want to use the \Symfony\Component\HttpFoundation\Response status code constants?

Suggested change
parent::__construct(400, 'Malformed URL.');
parent::__construct(Response::HTTP_BAD_REQUEST, 'Malformed URL.');

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave this decision up to the maintainers, it doesn't really matter to me.

@taylorotwell taylorotwell merged commit 140d2cf into laravel:master Jan 28, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants