zero-width space unicode character at the end of EmailStr is not caught as invalid #3772

Closed
3 tasks done
ignatiusab opened this issue Feb 3, 2022 · 2 comments
Labels
bug V1 Bug related to Pydantic V1.X

Comments

@ignatiusab
Checks

  • I added a descriptive title to this issue
  • I have searched (google, github) for similar issues and couldn't find anything
  • I have read and followed the docs and still think this is a bug

Bug

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

$ python -c "import pydantic.utils; print(pydantic.utils.version_info())"
         pydantic version: 1.9.0
        pydantic compiled: True
             install path: /home/mypc/venv/lib/python3.8/site-packages/pydantic
           python version: 3.8.10 (default, Nov 26 2021, 20:14:08)  [GCC 9.3.0]
                 platform: Linux-5.10.60.1-microsoft-standard-WSL2-x86_64-with-glibc2.29
 optional deps. installed: ['email-validator', 'typing-extensions']

Current behavior:

from pydantic import BaseModel, EmailStr

class EmailModel(BaseModel):
    email: EmailStr

# this input should be caught as an invalid email, but no exception is raised
zzz = EmailModel(email='example@example.com\u200b')
print(zzz.email, repr(zzz.email))

# adding any character after the zero-width space correctly invalidates the input
zzz = EmailModel(email='example@example.com\u200b1')
print(zzz.email, repr(zzz.email))

output:

example@example.com​ 'example@example.com\u200b'
Traceback (most recent call last):
  File "pydantic-play.py", line 137, in <module>
    zzz = EmailModel(email='example@example.com\u200b1')
  File "pydantic/main.py", line 331, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for EmailModel
email
  value is not a valid email address (type=value_error.email)

The underlying email_validator module correctly strips the zero-width space Unicode character when no "valid" character follows it.

from email_validator import validate_email
print(repr(validate_email("example@example.com\u200b").email))
print(repr(validate_email("example@example.com\u200b1").email))

output:

'example@example.com'
Traceback (most recent call last):
  File "pydantic-play.py", line 142, in <module>
    print(repr(validate_email("example@example.com\u200b1").email))
  File "/home/mypc/venv/lib/python3.8/site-packages/email_validator/__init__.py", line 231, in validate_email
    domain_part_info = validate_email_domain_part(parts[1])
  File "/home/mypc/venv/lib/python3.8/site-packages/email_validator/__init__.py", line 440, in validate_email_domain_part
    raise EmailSyntaxError(
email_validator.EmailSyntaxError: The domain name example.com1 is not valid. It is not within a valid top-level domain.

Expected behavior:
Either raise a validation error or strip the zero-width space Unicode character.
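
Until the validator handles this, either option can be approximated on the raw string before it reaches EmailStr. The following is a minimal sketch using only the standard library, not pydantic or email-validator behavior. The helper names (invisible_chars, strip_invisible, reject_invisible) are hypothetical, and treating every Unicode format character (category Cf, which includes U+200B) as unwanted is a simplifying assumption.

```python
import unicodedata


def invisible_chars(value: str) -> list:
    """Return the Unicode names of format-category (Cf) characters in value.

    Assumption: any Cf character (e.g. ZERO WIDTH SPACE) is unwanted in an email.
    """
    return [
        unicodedata.name(ch, f"U+{ord(ch):04X}")
        for ch in value
        if unicodedata.category(ch) == "Cf"
    ]


def strip_invisible(value: str) -> str:
    """Option 1: silently drop format characters."""
    return "".join(ch for ch in value if unicodedata.category(ch) != "Cf")


def reject_invisible(value: str) -> str:
    """Option 2: raise if any format character is present."""
    found = invisible_chars(value)
    if found:
        raise ValueError(f"email contains invisible characters: {', '.join(found)}")
    return value


print(invisible_chars("example@example.com\u200b"))
print(repr(strip_invisible("example@example.com\u200b")))
```

In pydantic V1, either helper could presumably be called from a `@validator('email', pre=True)` method on the model so that it runs before the EmailStr check. Note that stripping turns 'example@example.com\u200b1' into 'example@example.com1', which still has to be rejected by the email validation itself.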

@ignatiusab ignatiusab added the bug V1 Bug related to Pydantic V1.X label Feb 3, 2022
@adriangb
Member

This is an issue with email-validator, the dependency we use to validate email addresses. It seems to have been fixed in their new 2.0.0 release. Pydantic V2 will use that version, and we'll put out a release for the Pydantic V1 series (#5627) that uses it as well. I believe you can also upgrade the dependency on your current pydantic version, and it might work.

from pydantic import BaseModel, EmailStr

class EmailModel(BaseModel):
    email: EmailStr

EmailModel(email='example@example.com\u200b')
"""
    pydantic_core._pydantic_core.ValidationError: 1 validation error for EmailModel
email
  value is not a valid email address: The email address contains unsafe characters: ZERO WIDTH SPACE. [type=value_error, input_value='example@example.com\u200b', input_type=str]
"""

EmailModel(email='example@example.com\u200b1')
"""
    pydantic_core._pydantic_core.ValidationError: 1 validation error for EmailModel
email
  value is not a valid email address: The email address contains unsafe characters: ZERO WIDTH SPACE. [type=value_error, input_value='example@example.com\u200b1', input_type=str]
"""

@adriangb adriangb self-assigned this Apr 28, 2023
@adriangb
Member

This is now fixed in both V1 and V2. Please update email-validator if using V1.
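
For anyone on V1 who is unsure whether their environment already has the updated dependency, a quick check along these lines may help. It is a sketch under assumptions: the function names are hypothetical, and the 2.0.0 threshold is taken from the comment above saying the fix appears to be in the 2.0.0 release, not from the email-validator changelog.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Extract the leading major number from a version string such as '2.0.0'."""
    return int(version.split(".")[0])


def email_validator_is_fixed(installed=None) -> bool:
    """Return True if email-validator looks new enough (major version >= 2).

    Assumption: 2.0.0 is the first release that rejects the zero-width space.
    If `installed` is None, the version is looked up from the environment.
    """
    if installed is None:
        try:
            installed = metadata.version("email-validator")
        except metadata.PackageNotFoundError:
            # Package is not installed at all, so it cannot be the fixed version.
            return False
    return major_version(installed) >= 2


print(email_validator_is_fixed())
```

If this prints False, upgrading the package (for example with `pip install --upgrade email-validator`) should pull in a 2.x release.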
