Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make StrftimeItems::new return Result #902

Conversation

jaggededgedjustice
Copy link

This makes the date_time.format() function return a Result instead of panicking when passed an invalid format string. If any part of the format string is invalid an error is returned.
This is a breaking change for the api.

Fixes #623
closes #614

Copy link
Contributor

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks for fixing this up!

src/format/strftime.rs Outdated Show resolved Hide resolved
@jaggededgedjustice jaggededgedjustice force-pushed the handle-invalid-format-strings branch 2 times, most recently from ffdcb46 to 25e8396 Compare December 13, 2022 20:18
@jaggededgedjustice
Copy link
Author

Ok, I have things mostly working except for some of the configurations. I'm not sure how best to handle the failures though.

Copy link
Contributor

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is looking great!

@@ -184,11 +186,6 @@ use alloc::vec::Vec;
use super::{locales, Locale};
use super::{Fixed, InternalFixed, InternalInternal, Item, Numeric, Pad};

#[cfg(feature = "unstable-locales")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the cause of the test failures. Note that, without unstable-locales, this is a slice, not a Vec. chrono without default features (and in WASM scenarios) has included some formatting functionality in that case, but Vec is not available there. So we have to find a way to use a &[Item<'a>] in StrftimeItems.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a problem. I'm assuming that other collection types like LinkedList are also unavailable, and I don't see a good way to do this without a resizeable collection.

We could go back to the system where the iterator generates elements as we go and just run through the string once at creation time to check it's all valid. That feels wasteful though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having spent some time thinking about it, I think we do want to have StrftimeItems::new return a Result but I can't see any way to use an array as we can't give a size at compile time.
So my current plan is to go back to the system of storing the format string as a str and processing it as the iterator is called but iterate through the format string once at creation time to check for problems. Do you have any objections or better ideas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to look at it again tomorrow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No better ideas, so let's do it that way for now.

let d_fmt = StrftimeItems::new(locales::d_fmt(locale)).collect();
let d_t_fmt = StrftimeItems::new(locales::d_t_fmt(locale)).collect();
let t_fmt = StrftimeItems::new(locales::t_fmt(locale)).collect();
pub fn new(s: &'a str) -> Result<StrftimeItems<'a>, ParseError> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should have a new function parse or similar for this, and keep the existing function as a lazy iterator? My thinking for this is that if we just want to validate strftime strings, the iterator is more efficient as we can bail out earlier if we hit an error. Alternatively, we may want to collect all the errors rather than the first one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for just validating a strftime string? I also think it's pretty unusual for things to collect all errors instead of bailing on the first one (for example, the FromIterator impl for Result).

@jaggededgedjustice jaggededgedjustice force-pushed the handle-invalid-format-strings branch 3 times, most recently from db8c8e7 to fac117e Compare January 22, 2023 16:20
@jaggededgedjustice jaggededgedjustice changed the title Make date_time.format() return Result Make StrftimeItems::new return Result Jan 24, 2023
@jaggededgedjustice
Copy link
Author

can someone explain the failures? The error message seems to be pointing to a line that doesn't exist in the source.

@Colonial-Dev
Copy link

Colonial-Dev commented Feb 20, 2023

If anybody needs a workaround until this is merged, here's what I cooked up:

// Workaround to avoid panicking when the user-provided format string is invalid.
// Will be obsolete once https://github.com/chronotope/chrono/pull/902 is merged.
std::panic::set_hook(Box::new(|_| ()));
let formatted = std::panic::catch_unwind(|| {
    datetime.format(&description).to_string()
})?;
let _ = std::panic::take_hook();

A dummy panic hook is necessary to suppress the panic message. I don't think this actually incurs an allocation (|_: &PanicInfo| () is a ZST) so it should be fine even in a hot loop.

(Paging #956 as well.)

@esheppa
Copy link
Collaborator

esheppa commented Feb 20, 2023

@SomewhereOutInSpace - you can also use a workaround as mentioned here #614 (comment). The main points are using StrftimeItems::new() and the write! macro depending on the task, both cases exposing the error as a regular Result rather than a panic.

@esheppa
Copy link
Collaborator

esheppa commented Feb 20, 2023

@jaggededgedjustice - the errors look mainly due to cases where what was previously a StrftimeItems or DelayedFormat is now encapsulated in a result which first needs to be handeled

@jaggededgedjustice
Copy link
Author

@esheppa the bit I'm struggling with is finding the line that's throwing the error. Consider, https://github.com/chronotope/chrono/actions/runs/3980518900/jobs/6823655339#logs which says

error[E0308]: mismatched types
    --> src/naive/date.rs:1177:42
     |
1177 |         self.format_localized_with_items(StrftimeItems::new_with_locale(fmt, locale), locale)
     |              --------------------------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `StrftimeItems`, found enum `Result`
     |              |
     |              arguments to this function are incorrect
     |
     = note: expected struct `StrftimeItems<'_>`
                  found enum `Result<StrftimeItems<'_>, ParseError>`

The error makes sense for the line shown, but I can't find where that line is. The error message points to src/naive/date.rs line 1177 except that line is a comment: https://github.com/jaggededgedjustice/chrono/blob/fac117e8bc4ff092300abd6dc5336156eeb3f137/src/naive/date.rs#L1177

and a search shows no call to new_with_locale that doesn't handle the new Return value:

grep -r new_with_locale
chrono-handle-invalid-format-strings/src/format/strftime.rs:    pub fn new_with_locale(s: &'a str, locale: Locale) -> Result<StrftimeItems<'a>, ParseError> {
chrono-handle-invalid-format-strings/src/format/mod.rs:    pub fn new_with_locale(
chrono-handle-invalid-format-strings/src/datetime/mod.rs:        Ok(self.format_localized_with_items(StrftimeItems::new_with_locale(fmt, locale)?, locale))
chrono-handle-invalid-format-strings/src/date.rs:        Ok(self.format_localized_with_items(StrftimeItems::new_with_locale(fmt, locale)?, locale))

So I have no idea what's going on, unless maybe the worker that ran the last build had an old copy of some code.

Copy link
Contributor

@jtmoon79 jtmoon79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs additional test cases for the path where a ParseError is returned.

@jaggededgedjustice
Copy link
Author

This needs additional test cases for the path where a ParseError is returned.

Do you want a test for every function that can now return a ParseError? There is test_strftime_items() in src/format/strftime.rs which checks that the constructor for StrftimeItems will return an error type when given invalid format strings.

@jtmoon79
Copy link
Contributor

jtmoon79 commented Apr 8, 2023

This needs additional test cases for the path where a ParseError is returned.

Do you want a test for every function that can now return a ParseError? There is test_strftime_items() in src/format/strftime.rs which checks that the constructor for StrftimeItems will return an error type when given invalid format strings.

If you would please, I'd really appreciate it. While chrono does have decent tests for the public functions, it has few tests on the underlying private functions the comprise the "backend" of chrono. It would be a good project "quality confidence" improvement (Test Engineer lingo) to require tests for private functions.

Hopefully @djc and @esheppa agree. Final decisions should defer to them.

@esheppa
Copy link
Collaborator

esheppa commented Apr 9, 2023

@jtmoon79 that does seem reasonable overall, but as far as I can see there aren't any new (non testing) private functions in this PR? However I think it would be nice to have some basic tests on the StrftimeItems::new* functions now that they are fallible

@jtmoon79
Copy link
Contributor

jtmoon79 commented Apr 9, 2023

@jtmoon79 that does seem reasonable overall, but as far as I can see there aren't any new (non testing) private functions in this PR? However I think it would be nice to have some basic tests on the StrftimeItems::new* functions now that they are fallible

Good point. I had too many PR reviews in my head when I wrote that 🥴

@jaggededgedjustice
Copy link
Author

Are you happy with the current tests that rely on is_err(), or should they check the specific error that gets returned?


#[test]
fn test_format() {
assert!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename to fn test_date_format_err() and add a few more tests within it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions:

  • "" (not sure if this should error, you'll have to try it)
  • "%X"
  • "%%%"
  • "%Y%"
  • " %Y% "
  • "%ぁ"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many test cases do think would be required? Personally I'd leave it with just the 1 case as the error is just being passed through from StrftimeItems::new() which has test_strftime_items to check a wide range of inputs so I don't see the benefit of duplicating the check here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's reasonable.

@@ -1921,3 +1921,16 @@ where
assert!(from_str(r#"{"date":{"ymdf":20},"time":{"secs":0,"frac":0}}"#).is_err());
assert!(from_str(r#"null"#).is_err());
}

#[test]
fn test_format() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename to fn test_datetime_format_err() and add a few more tests within it.

Suggestions:

  • "" (not sure if this should error, you'll have to try it)
  • "%%%"
  • "%Y%"
  • " %Y% "
  • "%ぁ"


#[test]
fn test_format() {
assert!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's reasonable.

@@ -1381,3 +1381,11 @@ where
assert!(from_str(r#"{"secs":0,"frac":0}"#).is_err());
assert!(from_str(r#"null"#).is_err());
}

#[test]
fn test_format() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this to test_time_format_err.

@jaggededgedjustice jaggededgedjustice force-pushed the handle-invalid-format-strings branch 2 times, most recently from a39ca78 to 82df95c Compare May 28, 2023 10:54
@pitdicker
Copy link
Collaborator

pitdicker commented May 29, 2023

Edit: (I have to think a bit more before writing this comment)

@pitdicker
Copy link
Collaborator

pitdicker commented May 31, 2023

While I agree this issue is worth fixing, I am not sure the solution here is the best.

My concern are the ergonomics for library users. Also this solution requires parsing the format string twice, which is unfortunate.


If I understand it right, the only combination a panic is designed to happen with the current formatting and parsing code is in this snippet:

let date = NaiveDate::from_ymd_opt(2023, 5, 31).unwrap();
let date_str = date.format("%Y-%M-%D %").to_string();

The actual panic doesn't happen in any code that is part of chrono. It happens in std::string::ToString::to_string.
When std::fmt::Display::fmt returns an error, to_string can't pass on the error but panics.

If you use Display::fmt with a method that can return an error, it all works as it should:

let date = NaiveDate::from_ymd_opt(2023, 5, 31).unwrap();
let date_str = format!("{}", date.format("%Y-%M-%D %"))?;
/// or:
println!("{}", date.format("%Y-%M-%D %"))?;

This PR requires extra error handling for common cases such as println and parsing that were fine without it.

As an alternative (that also works on 0.4.x): what if we had convenience methods format_to_string that directly return fmt::Result<String>? It would give users directly what they want, a string, and skips the delayed formatting stuff which is not useful in this case. And maybe we can add try_to_string to DelayedFormat?


Also there are two cases that are currently not covered by the solution in this PR:

  • At the time a format string is created, by StrftimeItems::new, it doesn't yet know with which type it will be used for formatting. It may reasonably be used with multiple types. Formatting will return an error when the type does not contain the fields needed for the format string, such as time fields on NaiveDate. This would still panic in to_string. But that is fixable in this PR by adding more validation in the format methods.

  • (Contrived) A problem in the format string may not be the only reason formatting can fail, and panic in to_string. For example the RFC2822 formatting item returns an error if the year is negative or more than 4 digits, because that is not supported by that standard. But admittedly this is the only case I found.

@pitdicker pitdicker mentioned this pull request Jun 3, 2023
@pitdicker
Copy link
Collaborator

pitdicker commented Jun 4, 2023

Found some interesting history.

In 2017 ToString::to_string was changed to panic on errors (issue, PR).

The documentation for Display says, slightly hidden in the fmt module:

Additionally, the return value of this function is fmt::Result which is a type alias of Result<(), std::fmt::Error>. Formatting implementations should ensure that they propagate errors from the Formatter (e.g., when calling write!). However, they should never return errors spuriously. That is, a formatting implementation must and may only return an error if the passed-in Formatter returns an error. This is because, contrary to what the function signature might suggest, string formatting is an infallible operation. This function only returns a result because writing to the underlying stream might fail and it must provide a way to propagate the fact that an error has occurred back up the stack.

So that the Display implementation of DelayedFormat can return an error is indeed wrong according to this.

@pitdicker
Copy link
Collaborator

My new thoughts on this PR (I hope I now finally have all the context...)

  • Validation should happen in the various format methods, as they know the available fields of the type.
  • StrftimeItems::new is not really the right place for the validation. Doing it here would require running the iterator (and parsing the formatting string) three times. I would keep it as it currently is.
  • Validation should check against:
    • Item::Error
    • Internal(InternalFixed { val: InternalInternal::TimezoneOffsetPermissive }), which currently panics when formatting.
    • Any fields that are not part of the supplied type.
  • The RFC2822 item is not available as a formatting specifier. Maybe we should remove it, and let users call the already existing explicit function to format to this. It is the only formatting specifier that can return an error.

@djc djc mentioned this pull request Jun 5, 2023
@djc
Copy link
Contributor

djc commented Jun 5, 2023

I feel like this ends up being a lot of pain while not getting us to the API we really want. Please have a look at #1127 and give feedback on the proposed design.

@pitdicker
Copy link
Collaborator

I feel like this ends up being a lot of pain while not getting us to the API we really want. Please have a look at #1127 and give feedback on the proposed design.

While only a first step towards a proper solution, we did get StrftimeItems::parse which can return an error if the format string is invalid in #1184.

Closing this PR. @jaggededgedjustice thanks for working on this though!

@pitdicker pitdicker closed this Feb 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Parsing an string with %Z causes the library to panic
6 participants