Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add split_inclusive #917

Closed
wants to merge 2 commits into from
Closed

Add split_inclusive #917

wants to merge 2 commits into from

Conversation

archer884
Copy link

This adds a new iterator type and a new method to both the byte and unicode version of Regex. Most of the code is copied from the ordinary split iterator. My doc example is probably terrible, but who doesn't like fruit?

Before opening this PR, I checked to see whether it's possible to mimic this behavior using, say, a non-capturing group with the regular split. If it is, I didn't manage to do it correctly, so maybe this is still useful to someone. No idea. In my own work, where I have wanted this, I have also wanted the captures associated with the pattern, so that might be a more useful contract than what is presented here.

Closes #681

src/re_unicode.rs Outdated Show resolved Hide resolved
src/re_unicode.rs Outdated Show resolved Hide resolved
@archer884
Copy link
Author

Apologies; I didn't realize I had to run a different command to run the doc tests. I'm fixing those now.

@BurntSushi
Copy link
Member

cargo test should run them automatically.

@archer884
Copy link
Author

cargo test should run them automatically.

That's what I'd have thought. It's possible there are just so many tests that I didn't see the failures as they scrolled past. I was expecting to see them reported at the end and might just not have been vigilant enough. Anyway, I looked a lot more closely this last time. :p

@archer884
Copy link
Author

I've attempted to correct the behavior of the iterator to match the behavior expressed by split_inclusive in std. I've also attempted to make the docs less awful.

This changeset adjusts the documentation to more closely match that
found in std for split_inclusive. It also changes the behavior such that
the matched substring appears at the end of each element as a terminator
rather than at the head of each element.

Sorry; I never actually *read* the split_inclusive docs in std.
@BurntSushi
Copy link
Member

Why was this closed?

@archer884
Copy link
Author

Well, there was no response for a week, and I don't like leaving things in limbo.

@BurntSushi
Copy link
Member

Fair enough. It would help communicating your expectations up front next time, and I would be able to tell you right away that I won't be able to live up to them.

@archer884

This comment was marked as off-topic.

@shner-elmo
Copy link

I'm also one of the many users struggling with importing regex patterns from Python to Rust,
after reading #285, #330, and #681, I was really hoping for this feature to be added.

If anybody is still interested in implementing it, let me know, I would love to work on it!

@BurntSushi
Copy link
Member

@shner-elmo We should keep discussion about this on the actual issue #681 (comment). Not on closed PRs.

@rust-lang rust-lang locked as resolved and limited conversation to collaborators Oct 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce split_inclusive to API
3 participants