Document order of returned matches #137

Open
mcpower opened this issue Dec 6, 2023 · 1 comment
mcpower commented Dec 6, 2023

The order of the returned matches from AhoCorasick methods is not explicitly documented. While the order can be inferred from the examples given, it is not fully clear for overlapping matches.

The overlapping example shows that the .end() of matches is strictly non-decreasing, but it does not show the behaviour when two matches end at the same index. I assume there are a few possibilities for what happens in this case:

  • longest match (earliest start) first
  • shortest match (latest start) first
  • earliest PatternID first
  • unspecified / random order

It would be helpful to document what happens in this case.
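
To make the tie case concrete, here is a minimal sketch that does not use the crate at all. `naive_overlapping` and the three sorting helpers are hypothetical names made up for illustration, and the brute-force search only stands in for an overlapping search; it says nothing about how AhoCorasick actually orders its results. With the patterns `["cd", "abcd", "bcd"]` and the haystack `"abcd"`, all three matches end at offset 4, and each candidate ordering from the list above gives a different sequence.

```rust
/// A match as (pattern index, start, end), with `end` exclusive.
type Match = (usize, usize, usize);

/// Brute-force overlapping search: reports every occurrence of every pattern.
/// This is only a stand-in for illustration, not the crate's algorithm.
fn naive_overlapping(patterns: &[&str], haystack: &str) -> Vec<Match> {
    let bytes = haystack.as_bytes();
    let mut out = Vec::new();
    for end in 1..=bytes.len() {
        for (id, pat) in patterns.iter().enumerate() {
            let p = pat.as_bytes();
            // A pattern matches here if it fits and equals the bytes ending at `end`.
            if !p.is_empty() && p.len() <= end && &bytes[end - p.len()..end] == p {
                out.push((id, end - p.len(), end));
            }
        }
    }
    out
}

/// Ties on `end` broken by earliest start, i.e. longest match first.
fn longest_first(mut m: Vec<Match>) -> Vec<Match> {
    m.sort_by_key(|&(_, start, end)| (end, start));
    m
}

/// Ties on `end` broken by latest start, i.e. shortest match first.
fn shortest_first(mut m: Vec<Match>) -> Vec<Match> {
    m.sort_by_key(|&(_, start, end)| (end, std::cmp::Reverse(start)));
    m
}

/// Ties on `end` broken by the smallest pattern index.
fn pattern_id_first(mut m: Vec<Match>) -> Vec<Match> {
    m.sort_by_key(|&(id, _, end)| (end, id));
    m
}
```

Calling `naive_overlapping(&["cd", "abcd", "bcd"], "abcd")` and passing the result through each helper yields the pattern indices in the order 1, 2, 0 (longest first), 0, 2, 1 (shortest first), and 0, 1, 2 (earliest pattern index first). A caller who depends on one particular order could apply a sort like this to the collected matches rather than rely on undocumented behaviour.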

@BurntSushi
Owner

Yeah I agree the docs could be improved here.

Even at the lowest layers of the public API, the order is specifically not mentioned. I'm wondering whether I did that intentionally.

(I believe the actual behavior is earliest PatternID first.)
