Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to regex-cursor #9422

Merged
merged 1 commit into from Feb 26, 2024
Merged

Switch to regex-cursor #9422

merged 1 commit into from Feb 26, 2024

Conversation

pascalkuthe
Copy link
Member

@pascalkuthe pascalkuthe commented Jan 25, 2024

Closes #185

Replaces our use of regex with regex-cursor, so we don't have to convert ropes to string. That implementation passes the regex testsuite so it should be stable but definitely needs testing.

See discussion at rust-lang/regex#425.

I used the chance to replace split_at_newline with something that uses ropey so that unicode feature flags are correctly respected (should also be slightly faster).

@pascalkuthe pascalkuthe added A-dependencies Area: Dependency S-waiting-on-review Status: Awaiting review from a maintainer. S-needs-testing Status: Needs to be tested out in order to discover potential bugs. labels Jan 25, 2024
@kirawi
Copy link
Member

kirawi commented Jan 31, 2024

Crashed with (not sure what I did)

thread 'main' panicked at /home/sumire/.cargo/registry/src/index.crates.io-6f17d22bba15001f/regex-cursor-0.1.2/src/input.rs:204:13:
internal error: entered unreachable code: cursor does not support backtracking 506
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: regex_cursor::engines::dfa::search::find_fwd
   3: regex_cursor::engines::dfa::try_search
   4: regex_cursor::engines::meta::strategy::StrategyI::search
   5: regex_cursor::engines::meta::regex::Regex::search
   6: helix_term::commands::search_impl
   7: helix_term::commands::search_next_or_prev_impl
   8: helix_term::ui::editor::EditorView::handle_keymap_event::{{closure}}
   9: helix_term::ui::editor::EditorView::handle_keymap_event
  10: <helix_term::ui::editor::EditorView as helix_term::compositor::Component>::handle_event
  11: helix_term::compositor::Compositor::handle_event
  12: hx::main_impl::{{closure}}
  13: tokio::runtime::park::CachedParkThread::block_on
  14: tokio::runtime::context::runtime::enter_runtime
  15: tokio::runtime::runtime::Runtime::block_on
  16: hx::main

@pascalkuthe
Copy link
Member Author

pascalkuthe commented Jan 31, 2024

Crashed with (not sure what I did)

thread 'main' panicked at /home/sumire/.cargo/registry/src/index.crates.io-6f17d22bba15001f/regex-cursor-0.1.2/src/input.rs:204:13:
internal error: entered unreachable code: cursor does not support backtracking 506
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: regex_cursor::engines::dfa::search::find_fwd
   3: regex_cursor::engines::dfa::try_search
   4: regex_cursor::engines::meta::strategy::StrategyI::search
   5: regex_cursor::engines::meta::regex::Regex::search
   6: helix_term::commands::search_impl
   7: helix_term::commands::search_next_or_prev_impl
   8: helix_term::ui::editor::EditorView::handle_keymap_event::{{closure}}
   9: helix_term::ui::editor::EditorView::handle_keymap_event
  10: <helix_term::ui::editor::EditorView as helix_term::compositor::Component>::handle_event
  11: helix_term::compositor::Compositor::handle_event
  12: hx::main_impl::{{closure}}
  13: tokio::runtime::park::CachedParkThread::block_on
  14: tokio::runtime::context::runtime::enter_runtime
  15: tokio::runtime::runtime::Runtime::block_on
  16: hx::main

Thats a stranfe one that shouldn't happen but it's very hard to track down without reproduction case (the backtrack is too stripped because it's a release build)

@kirawi
Copy link
Member

kirawi commented Jan 31, 2024

Ah, it's hx then /git nix. This is on a personal fork with multiple PRs pulled in (I forgot to mention) but I don't think any of them touch regex.

Edit: I can reproduce with this branch alone

@pascalkuthe
Copy link
Member Author

oh yeah I can reproduce that is very odd. Is suspect it has do do with the special case of an empty file

@kirawi
Copy link
Member

kirawi commented Jan 31, 2024

I don't think it's a special case with an empty file. If you delete the entire buffer and then do /git n it won't crash. If you add a new line and then 1G, it won't crash. If you add a new line without moving the cursor, it crashes. Maybe it has something to do with the cursor being on the line before the EOF newline?

@pascalkuthe
Copy link
Member Author

It was an edgecase when you search for stuff at a one-past-the-end position which is what I suspected. Adapting the ropey cursor semantics to the one regex-cusor uses isn't trivial. I published a new version where that is fixed

@kirawi kirawi added the C-perf label Feb 12, 2024
@archseer archseer added this to the next milestone Feb 25, 2024
Copy link
Member

@the-mikedavis the-mikedavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using this for a while in my driver branch and I haven't noticed any more bugs. It's also really nice to build on top of - for example matching nodes given by tree-sitter with text.regex_input_at_bytes(node.start_byte()..node.end_byte())

@gabydd
Copy link
Member

gabydd commented Feb 25, 2024

I also haven't hit any bugs with this anymore

Copy link
Member

@archseer archseer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work as always! Let's get this to soak in master for a bit

@archseer archseer merged commit cd02976 into master Feb 26, 2024
6 checks passed
@archseer archseer deleted the regex-cursor branch February 26, 2024 07:45
shortc pushed a commit to shortc/helix that referenced this pull request Mar 31, 2024
postsolar pushed a commit to postsolar/helix that referenced this pull request Apr 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dependencies Area: Dependency C-perf S-needs-testing Status: Needs to be tested out in order to discover potential bugs. S-waiting-on-review Status: Awaiting review from a maintainer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regex Automata
5 participants