New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prevent unnessary b' prefix in parse_isodate exception messages #1122
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the PR! It's almost there.
dateutil/parser/isoparser.py
Outdated
@@ -159,7 +159,7 @@ def parse_isodate(self, datestr): | |||
components, pos = self._parse_isodate(datestr) | |||
if pos < len(datestr): | |||
raise ValueError('String contains unknown ISO ' + | |||
'components: {}'.format(datestr)) | |||
'components: {}'.format(datestr.decode())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this is happening as the function uses _takes_ascii
. Can you pass "ascii" as the codec to decode the bytes? In case the default encoding is another one.
dateutil/test/test_isoparser.py
Outdated
isoparser().parse_isodate(isostr) | ||
|
||
# ensure the error message does not contain b' prefixes | ||
assert "b'{}'".format(isostr) not in str(excinfo.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than checking for the not presence of b'X'
, would you mind adding a new test case for this (feel free to copy this one) and validate the fully generated exception str?
@mariocj89 Thanks so much for the good feedback! I've added fixes for the things you've mentioned. Let me know if I need to make any additional tweaks. |
dateutil/parser/isoparser.py
Outdated
@@ -159,7 +159,7 @@ def parse_isodate(self, datestr): | |||
components, pos = self._parse_isodate(datestr) | |||
if pos < len(datestr): | |||
raise ValueError('String contains unknown ISO ' + | |||
'components: {}'.format(datestr)) | |||
'components: {}'.format(datestr.decode('ascii'))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think it'd be even easier to read if we passed here: 'components: {!r}'
.
That will put '
around the text that failed (Say for example that we have some invisible characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed: f28e639
Definitely agree with removing the quotes making it more difficult to detect things like whitespace.
As the function is wrapped around a decorator that converts the input to bytes, when an exception is raised that includes the value, it contains the value as bytes (b"<val>"). To provide the expected str in Python 3, decode that to unicode. This will result in `u""` for Python 2 callers, but given that we plan to drop support for Python 2 in the short term, it feels acceptable.
Thanks, I've added a fixup for your tests to pass in Py2. I'll land this later today. |
Summary of changes
I started working on #383 by adding type hints to this branch: https://github.com/dateutil/dateutil/compare/master...pawl:monkeytype_merge_pyi?expand=1#diff-fba91b4117b31a0a29957adeb621fe79541fdb2424d10359e444df0d1cea2473R4
When I run mypy to see if those type hints work, I get a bunch of errors (a lot of which are probably my fault): https://gist.github.com/pawl/7ad1e9ce650b3241cc106d238d7edf04
This PR fixes one of those errors:
That error is saying
isoparser().parse_isotime('2014-04-19T')
will give an error that looks like this with theb'
prefix around the input:This is happening because the
_takes_ascii
decorator turns the string input into a bytestring, and python 3 represents bytestrings withb'
when you pass them into.format()
.This PR fixes the issue by using
.decode()
on the bytestring before using it with.format()
for the error message.Pull Request Checklist