New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Call .decode() only for binary data #1060
Conversation
dateutil/parser/_parser.py
Outdated
else: | ||
if getattr(instream, 'decode', None) is not None: | ||
instream = instream.decode() | ||
if isinstance(instream, (bytes, bytearray)) or six.PY3 and isinstance(instream, memoryview): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't be better to do duck-typing and test for the presence of the decode() method?
if hasattr(instream, "decode"):
instream = instream.decode()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you pass in Python 2 a unicode string it will still attempt to decode it by encoding it first. And if that string has a non-ascii char, it'd fail to encode.
dateutil/parser/_parser.py
Outdated
else: | ||
if getattr(instream, 'decode', None) is not None: | ||
instream = instream.decode() | ||
if isinstance(instream, (bytes, bytearray)) or six.PY3 and isinstance(instream, memoryview): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that what @vstinner meant with supporting memoryview was not this. This will fail when a memoryview is passed, as it does not have a decode
method AFAIK.
If you want to add support for memoryview, please add tests for it. Alternatively, feel free to just remove or six.PY3 and isinstance(instream, memoryview)
from this PR.
I must admit that I kinda lost the context of this change after the eleven months. Give me a while or, if you know the solution, feel free to open a new PR based on this one and finish the fix. |
Sure, I'll push a commit with the additions. |
dateutil/parser/_parser.py
Outdated
else: | ||
if getattr(instream, 'decode', None) is not None: | ||
instream = instream.decode() | ||
if isinstance(instream, (bytes, bytearray)) or six.PY3 and isinstance(instream, memoryview): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the same time, I see that commit 8e78646 says "Add support for bytesarray and other bytes-like", not sure if there is any other case we are missing here than bytes
, bytearray
and unicode
.
Let's see if the test suite catches anything.
We can handle both bytes and bytearray in the same way in Python 2 and Python 3. Co-authored-by: Mario Corchero <mariocj89@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but would be good to get another review as I rewrote your commit.
@vstinner could you please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Ignore my remark about memoryview, memoryview objects have no decode() method.
Summary of changes
I believe that the
__init__
method might be simplified in this way. In my opinion, it makes sense to calldecode()
method only on bytes/bytearray objects in both major Pythons.In Python 2,
bytes
is builtin alias forstr
sodecode()
call does not happen for unicode.Do you know about any corner case this simplified code is not aware of?
Pull Request Checklist