Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add take_while_inclusive method to Iterator #142

Open
junbl opened this issue Nov 29, 2022 · 7 comments
Open

Add take_while_inclusive method to Iterator #142

junbl opened this issue Nov 29, 2022 · 7 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@junbl
Copy link

junbl commented Nov 29, 2022

Proposal

Problem statement

The existing take_while method on Iterator is useful for getting elements that satisfy a predicate. However, it has the notable drawback of removing the first element that doesn't satisfy the predicate, since it must take that element from the iterator to determine if the predicate is finished.

This proposal adds a method take_while_inclusive that returns elements that satisfy a predicate, but also including that first element that didn't satisfy said predicate. The rest of the iterator is left unchanged and thus no elements are lost.

Motivation, use-cases

Here's the use case in which I wanted this method (paraphrasing since it's closed source). I was parsing a list of parameters that can contain in itself a list of values, so I needed to conditionally take more values from an iterator to create that element.

In this case, since the first element not satisfying the predicate represents a delimiter, it should be included with the other elements. In other cases, you might need to split that last element off to add it back into the collection.

const LIST_START: &str = "[";
const LIST_END: &str = "]";    
let s = "a,b,c,[d,e,f],g";
let mut items = s.split(',');
let mut collected = Vec::new();
while let Some(item) = items.next() {
    if item.starts_with(LIST_START) {
        // item is the start of a list, so collect all the remaining nested list elements together
        let list = std::iter::once(item)
            .chain((&mut items).take_while_inclusive(|s| !s.ends_with(LIST_END)))
            .collect();
        collected.push(list);
    } else {
        collected.push(item.to_string());
    }
}
assert_eq!(collected, ["a", "b", "c", "[def]", "g"].map(ToString::to_string));

Also see the first Rust issue in which this was discussed for an example of a use case involving histogram bins: rust-lang/rust#62208

Playground link to above

Solution sketches

In the absence of this method, I initially used the take_while_ref method from itertools. However, this clones the first element that doesn't satisfy the predicate, which is less than ideal in my use case and impossible in others. My current code uses peeking_take_while and manually adds that next element back.

Links and related work

My PR to itertools implementing said method: rust-itertools/itertools#616

First discussed addition to std in rust-lang/rust#62208

Issue for addition to itertools: rust-itertools/itertools#597

The PR to itertools has been put on hold before investigating if this is an appropriate addition to std (to prevent another instance of collision like intersperse).

@junbl junbl added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Nov 29, 2022
@scottmcm
Copy link
Member

If take_while_inclusive is added, I think a map_while_inclusive would make sense to include along with it. Not sure exactly what type that should use, though -- ControlFlow could work, as would Result, or it could be made to work with any Try type.

@the8472
Copy link
Member

the8472 commented Nov 30, 2022

Shouldn't we let it bake in itertools first before considering it for std? It doesn't seem immediately obvious to me how much code using for-in loops code be replaced with this, i.e. how useful it is in general.

The PR to itertools has been put on hold before investigating if this is an appropriate addition to std (to prevent another instance of collision like intersperse).

The plan is to fix the underlying method resolution issues so we don't have to coordinate between crates anymore.

@junbl
Copy link
Author

junbl commented Dec 1, 2022

The plan is to fix the underlying method resolution issues so we don't have to coordinate between crates anymore.

Assuming you're referring to this guy. Doesn't look like a lot of progress has been made on the impl, but if we're okay with it blocking this kind of addition, I'm happy to drop this in favor of the itertools pr and maybe revisit when that's finished.

@jswrenn what do you think?

@Xiretza
Copy link

Xiretza commented Dec 8, 2022

If I may add my opinion, I would expect this method to be called take_until, it's the term that I searched for when trying to find a method that does this. Additionally, take_while_inclusive just doesn't make any sense to me - if anything, it would have to be called take_while_and_one_more, the extra element at the end is explicitly not covered by the "while". take_until_inclusive would also make more sense.

@jswrenn
Copy link
Member

jswrenn commented Dec 8, 2022

Shouldn't we let it bake in itertools first before considering it for std?

For the past few years, Itertools has had a policy (noted in its README) of not accepting additions which could also be plausibly made to Iterator. In practical terms, that means we only accept adapters that allocate (since Iterator is going to be stuck in core for the foreseeable future), unless we get a clear signal from the libs team that the addition is very unlikely to be included in the standard library. We'll reconsider that policy once there's been progress on the plan to fix method resolution.

I personally think this particular adapter is quite useful (particularly for parsing), and expect that it'll eventually make its way into the standard library. I don't want the presence of this adapter in Itertools to prevent its addition to Iterator.

@pitaj
Copy link

pitaj commented Jan 23, 2023

What about adding a take_until kind of function to Peekable?

@junbl
Copy link
Author

junbl commented Apr 7, 2023

What about adding a take_until kind of function to Peekable?

Exists in itertools, but afaict no traffic as yet on adding to std.

https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.peeking_take_while

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

6 participants