user/pwd encoding is assumed (hardcoded) to be utf-8 #151

Closed
lafrech opened this issue May 25, 2022 · 5 comments

@lafrech

lafrech commented May 25, 2022

Hi.

First of all, thanks @miguelgrinberg for sharing this lib. We use it to secure our APIs and it works a treat.

We just got caught by an encoding issue today while setting a password containing a ç.

Our interface uses requests and requests uses latin-1 by default to encode user/pwd.

https://github.com/psf/requests/blob/2a6f290bc09324406708a4d404a88a45d848ddf9/requests/auth.py#L56-L60

However, Flask-HTTPAuth assumes utf-8:

username = username.decode('utf-8')
password = password.decode('utf-8')
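
To make the mismatch concrete, here is a minimal sketch using only the standard library. It mimics what a latin-1 client puts on the wire and what a utf-8-only server then does with it. The credentials are made up for illustration and the variable names are not taken from either library.

```python
import base64

# Hypothetical credentials; the password is "garçon", written with an escape.
username, password = "user", "gar\u00e7on"

# What a latin-1 client (requests' default for str credentials) sends.
token = base64.b64encode(f"{username}:{password}".encode("latin-1"))

# What a server that assumes utf-8 does with the decoded token.
raw_user, raw_pwd = base64.b64decode(token).split(b":", 1)
try:
    raw_pwd.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    # The single byte 0xE7 is not a valid utf-8 sequence on its own.
    decoded_ok = False
```

With an ASCII-only password both encodings produce the same bytes, which is why this kind of problem can go unnoticed in tests.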

There doesn't seem to be a consensus about which encoding to use: https://stackoverflow.com/questions/7242316/what-encoding-should-i-use-for-http-basic-authentication.

From this discussion, I understand that requests uses latin-1 for historical reasons. If a change were to be made, it would have to happen in a major version, and it would more likely be to stop assuming any encoding and let users encode the data themselves.

I don't expect Flask-HTTPAuth to change anything at this point. I just thought it might be useful to draw attention to this issue, because this is the kind of problem that tends to occur in production rather than in tests and it can be a bit tricky to debug.

Perhaps Flask-HTTPAuth could expose a parameter or a class attribute to customize the encoding. And/or say a few words about it in the docs.

Honestly I don't know which encoding I would choose if given the choice. It is easy enough for us to modify our UI code to encode the credentials in utf-8 before passing them to requests so problem solved... until we realize that the OpenAPI frontend we're using (Rapidoc) doesn't seem to use utf-8 either. I shall open an issue there as well. (Related Swagger-UI discussion: swagger-api/swagger-ui#2456.)
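
For reference, a minimal sketch of the client-side workaround mentioned above: building the Basic Authorization header by hand from utf-8 bytes, using only the standard library. The helper name `basic_auth_header_utf8` is hypothetical and the credentials are made up.

```python
import base64


def basic_auth_header_utf8(username: str, password: str) -> str:
    """Return a Basic Authorization header value with utf-8 encoded credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical credentials; the password is "garçon", written with an escape.
header_value = basic_auth_header_utf8("user", "gar\u00e7on")
```

The resulting value could then be sent as an explicit `Authorization` header (for example through the `headers=` argument of a requests call) instead of relying on the default encoding applied to `auth=`.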

Thanks for reading. Hopefully this message will be helpful to people facing the same issue.

@miguelgrinberg
Owner

This is a tricky issue: the Basic Auth standard does not really have a great solution. The current version of the RFC suggests that the server explicitly indicate that it accepts UTF-8 credentials by adding the charset parameter to the authentication prompt (link). I'm not doing that, and I think I should.
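
As an illustration of the charset parameter, RFC 7617 defines a challenge of the form `WWW-Authenticate: Basic realm="...", charset="UTF-8"`, where "UTF-8" is the only value allowed. The sketch below just builds that header value; `basic_challenge` is a hypothetical helper, not part of Flask-HTTPAuth, and it assumes the realm contains no double quotes.

```python
def basic_challenge(realm: str, advertise_utf8: bool = True) -> str:
    """Build a WWW-Authenticate value for Basic auth (realm is not escaped here)."""
    value = f'Basic realm="{realm}"'
    if advertise_utf8:
        # RFC 7617 only permits the value "UTF-8" for this parameter.
        value += ', charset="UTF-8"'
    return value


challenge = basic_challenge("Authentication Required")
```

The parameter is advisory: it tells the client what the server prefers, but nothing forces the client to comply.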

Beyond that, I guess I could implement a latin-1 decode if the utf-8 one fails. Need to investigate if that is feasible.
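
A minimal sketch of what such a fallback could look like, assuming the raw credential bytes have already been base64-decoded and split. The helper name `decode_credential` is hypothetical and this is not necessarily how the library implements it.

```python
def decode_credential(raw: bytes) -> str:
    """Decode as utf-8 first, then fall back to latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this branch never raises.
        return raw.decode("latin-1")


from_utf8 = decode_credential("\u00e7".encode("utf-8"))      # "ç" sent as utf-8
from_latin1 = decode_credential("\u00e7".encode("latin-1"))  # "ç" sent as latin-1
```

One caveat with this approach: a latin-1 byte sequence that also happens to be valid utf-8 is read as utf-8. For example, the two latin-1 characters "Ã§" encode to the bytes `0xC3 0xA7`, which this helper would decode as "ç". Such passwords are probably rare, but the ambiguity exists.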

@miguelgrinberg
Owner

@lafrech Can I ask you to install the main branch of this repo and check if the latin1 credentials are now handled correctly? Thanks!

@lafrech
Author

lafrech commented May 31, 2022

Sorry about the delay. I confirm the change in 4.7.0 makes my use case work (I can use the "Try" feature of my OpenAPI viewer although it encodes in latin-1).

I can't tell if this is the right thing to do and if it could have unintended side effects.

Ideally, IMHO, the server would set its accepted encoding and the client would have to conform to it. I would have been satisfied with an option here to set the encoding server-side.

But maybe this looser implementation is better and will cause less trouble, assuming people only ever use either utf-8 or latin-1.

In any case, thanks for the quick "fix".

@miguelgrinberg
Owner

There is no way for the server to indicate an accepted encoding. If I understand the standard correctly, there are two options: saying nothing, or saying that UTF-8 should be used. Either way, the client can ignore what the server asks and send what it wants, so I don't see a point in indicating a UTF-8 preference.

@lafrech
Author

lafrech commented May 31, 2022

I meant the server could choose either latin-1 or utf-8 and document this somewhere (although in a non-standard way). But ultimately, clients wouldn't conform, and having the server silently accept both encodings probably makes everyone's life easier.
