Be slightly more strict around whitespace parsing #1130
@djc How would you consider this breakage? It is quite a bit more minimal than in #807. With a GitHub code search for
IIUC, this PR is proposing to allow formatting string
Did I understand the proposal of unlimited whitespace correctly? If so, I think this is a very bad idea.
We currently accept what you describe, and accept no whitespace at all. The change is to require at least one whitespace character. I agree that it doesn't make much sense to accept I just want to make sure we can make your work in #807 stick. But it seems to me it may break a little too many reasonable cases. And we don't have a We can consider the two spaces in
I don't find the whitespace ambiguity to be reasonable. Unless prescribed by an RFC, all characters in the format string and value string should match. I found this loose matching behavior surprising, and I had to create a workaround for this oddity (jtmoon79/super-speedy-syslog-searcher#6). I am open to allowing a new format specifier that allows for limited optional whitespace (I think "zero or one whitespace characters" is the best choice for this proposed specifier). This provides a good choice and would satisfy both "camps" of users: those that prefer precision and those that prefer whitespace flexibility.
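To make the options in this discussion concrete, here is a minimal sketch of the four whitespace behaviors mentioned so far: zero or more (what is accepted today), at least one (what this PR proposes), zero or one (the suggested new specifier), and exact matching. The `WsPolicy` enum and `consume_ws` function are hypothetical names made up for illustration; they are not part of chrono's API, and the "zero or one" variant is simplified to reject any longer run of whitespace.

```rust
/// Hypothetical whitespace policies discussed in this thread (not chrono's API).
#[derive(Clone, Copy, Debug, PartialEq)]
enum WsPolicy {
    /// Format whitespace matches any amount of input whitespace, including none.
    ZeroOrMore,
    /// Format whitespace requires at least one input whitespace character.
    OneOrMore,
    /// Format whitespace matches at most one input whitespace character
    /// (simplification: a longer run of whitespace is rejected outright).
    ZeroOrOne,
    /// Input must contain exactly the whitespace written in the format string.
    Exact,
}

/// Consume whitespace at the start of `input` according to `policy`.
/// `fmt_ws` is the whitespace as written in the format string.
/// Returns the rest of the input on success, `None` on a mismatch.
fn consume_ws<'a>(input: &'a str, fmt_ws: &str, policy: WsPolicy) -> Option<&'a str> {
    let trimmed = input.trim_start();
    // Number of leading whitespace characters in the input.
    let n_chars = input[..input.len() - trimmed.len()].chars().count();
    match policy {
        WsPolicy::ZeroOrMore => Some(trimmed),
        WsPolicy::OneOrMore => {
            if n_chars >= 1 { Some(trimmed) } else { None }
        }
        WsPolicy::ZeroOrOne => {
            if n_chars <= 1 { Some(trimmed) } else { None }
        }
        // The input must start with `fmt_ws` and have no further whitespace after it.
        WsPolicy::Exact => input.strip_prefix(fmt_ws).filter(|rest| *rest == trimmed),
    }
}
```

Under these assumptions, an input of `"AM"` against a single format space passes with `ZeroOrMore` and `ZeroOrOne` but fails with `OneOrMore` and `Exact`, while `"  AM"` (two spaces) passes only with `ZeroOrMore` and `OneOrMore`.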
#807 made a couple of changes that weren't obvious to me on first glance.
I would like to introduce the changes one by one (but in my own way to be honest...)
When parsing,
Item::Space
…That any whitespace in the formatting string would correspond to optional whitespace when parsing was an undocumented feature.
Some examples that were previously accepted:
"%a, %d %b %Y" accepts "Sat, 09 Aug 2013" and "Sat,09Aug2013"
"%P %H:%M" accepts "PM 12:59" and "PM12:59"
"%H:%M %P" accepts "12:59 AM" and "12:59AM"
"%r" accepts "12:34:60 AM", "12:34:60AM"
%c, the local date and time format, were optional.
"%F %T %Z" accepts "2001-07-08 00:34:60 UTC" and "2001-07-0800:34:60UTC"
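The difference between the old and the proposed behavior can be sketched with a toy matcher. This is not chrono's parser: `ToyItem`, `toy_matches`, and `take_while1` are hypothetical names, and the item list below only approximates the shape of "%H:%M %P" (digits, a colon, digits, a space, letters) without validating hours, minutes, or AM/PM.

```rust
/// Hypothetical, simplified format items (chrono's real `Item` type is richer).
enum ToyItem<'a> {
    Digits,           // one or more ASCII digits
    Alpha,            // one or more ASCII letters
    Literal(&'a str), // exact text
    Space,            // whitespace written in the format string
}

/// Strip one or more leading characters satisfying `pred`; `None` if there are none.
fn take_while1(input: &str, pred: impl Fn(char) -> bool) -> Option<&str> {
    let rest = input.trim_start_matches(|c: char| pred(c));
    if rest.len() == input.len() { None } else { Some(rest) }
}

/// Returns true if `input` matches `items` completely.
/// With `require_ws == false`, `Space` matches zero or more whitespace characters;
/// with `require_ws == true`, it needs at least one.
fn toy_matches(items: &[ToyItem], mut input: &str, require_ws: bool) -> bool {
    for item in items {
        let rest = match item {
            ToyItem::Digits => take_while1(input, |c| c.is_ascii_digit()),
            ToyItem::Alpha => take_while1(input, |c| c.is_ascii_alphabetic()),
            ToyItem::Literal(lit) => input.strip_prefix(*lit),
            ToyItem::Space => {
                let trimmed = input.trim_start();
                if require_ws && trimmed.len() == input.len() {
                    None // no whitespace found, but at least one was required
                } else {
                    Some(trimmed)
                }
            }
        };
        match rest {
            Some(r) => input = r,
            None => return false,
        }
    }
    // Trailing, unmatched input counts as a failure.
    input.is_empty()
}
```

With the items `[Digits, Literal(":"), Digits, Space, Alpha]`, both "12:59 AM" and "12:59AM" match when `require_ws` is `false`, and only "12:59 AM" matches when it is `true`.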
I would say that at least in some of these cases a user would not expect the string to parse.
The two cases where I would expect breakage are a space before the AM/PM suffix behind a time ("%H:%M %P"), and a space before the timezone abbreviation. To allow creating a formatting string that can parse such cases I made three more changes:
% . It will be encoded as Item::Space("").
Item::Space("") , where it would write nothing before. This item could only be created manually before, not parsed from a formatting string. And I don't see a reason someone would write, as it did nothing when formatting.
%r is changed to make the space between the time and AM/PM optional.