Invalid back reference error #112
Comments
We try to be compatible with Oniguruma syntax, which supports octal, so this is something we might add. How does Perl disambiguate between
I've never used Perl so I can't say for sure but this is how I think it works:
I've just found another valid Perl regular expression that can't be parsed:
/[({[<][. ]*(?-i:\xbc\xba[. ]*\xc0\xce[. ]*)?(?-i:\xb1\xa4(?:[. ]*|[\x00-\x7f]{0,3})\xb0\xed|\xc1\xa4[. ]*\xba\xb8|\xc8\xab[. ]*\xba\xb8)[. ]*[)}\]>]/
I had to escape every special character so it could be parsed:
/[\(\{\[\<][. ]*(?-i:\xbc\xba[. ]*\xc0\xce[. ]*)?(?-i:\xb1\xa4(?:[. ]*|[\x00-\x7f]{0,3})\xb0\xed|\xc1\xa4[. ]*\xba\xb8|\xc8\xab[. ]*\xba\xb8)[. ]*[\)\}\]\>]/
I'm not involved with fancy-regex (but I'm the author of Maybe a case here and there can be smoothed out, but in general, if you need to be able to "parse and match regexes written for Perl," then I think you have three choices:
(This same discussion has repeated itself several times in different forms on the regex crate repo.)
Looks like onig doesn't require a leading 0, need to check what it does when it's ambiguous: https://github.com/kkos/oniguruma/blob/master/doc/SYNTAX.md#28-onig_syn_op_esc_octal3-enable-ooo-octal-codes
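For comparison, here is a rough sketch of the rule Perl documents in perlrebackslash for a backslash followed by decimal digits: a single non-zero digit is always a backreference, a leading 0 always means octal, and anything else is a backreference only if that many capture groups have already been seen. The names `DigitEscape` and `classify` are made up for illustration and are not part of fancy-regex or onig, and cases like `\89` are deliberately left unhandled.

```rust
/// How an escape of the form `\` + decimal digits is read (hypothetical type).
#[derive(Debug, PartialEq)]
enum DigitEscape {
    /// Reference to capture group `n`.
    Backreference(usize),
    /// Octal character code; `len` is how many digits were consumed.
    Octal { value: u32, len: usize },
}

/// Classify the digits that follow a backslash, given how many capture
/// groups were opened before this point in the pattern.
fn classify(digits: &str, groups_seen: usize) -> Option<DigitEscape> {
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // A single non-zero digit is always a backreference.
    if digits.len() == 1 && digits != "0" {
        return digits.parse().ok().map(DigitEscape::Backreference);
    }
    // Without a leading zero, the number is a backreference only if that
    // many groups have already been seen.
    if !digits.starts_with('0') {
        if let Ok(n) = digits.parse::<usize>() {
            if n <= groups_seen {
                return Some(DigitEscape::Backreference(n));
            }
        }
    }
    // Otherwise read the leading run of at most three octal digits;
    // any remaining digits are matched literally by the caller.
    let len = digits
        .bytes()
        .take(3)
        .take_while(|b| (b'0'..=b'7').contains(b))
        .count();
    if len == 0 {
        return None; // e.g. `\89`: not covered by this sketch
    }
    let value = u32::from_str_radix(&digits[..len], 8).ok()?;
    Some(DigitEscape::Octal { value, len })
}
```

Under this rule `\10` is octal 010 in a pattern with fewer than ten groups, and a backreference once ten groups have been opened.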
I am currently porting SpamAssassin to Rust, which relies on hundreds of Perl regular expressions (many of them very inefficient), so my plan is to replace with native code those regexes that don't work. I have already fixed all the expressions that couldn't be parsed; I just opened this issue in case the author(s) wanted to support a syntax that Perl and other engines consider valid.
Hi,
I am writing a tool that needs to be able to parse and evaluate regular expressions originally written for Perl. Overall the library works great, but I am getting an Invalid back reference error when trying to parse the following regex:
This regex is parsed properly by Perl and also by online tools such as regex101.com. To make it work on fancy-regex I need to replace the octal references with their corresponding Unicode sequences:
Not a big deal, but I am opening this issue in case you consider not being able to parse those octal codes a bug in the library.
Thanks.
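In case it helps anyone hitting the same error, the manual workaround described above can probably be automated. The following is only a sketch: `octal_to_hex` is a made-up helper, not part of fancy-regex. It assumes the target engine accepts `\x{HH}` escapes and that the pattern has fewer than 100 capture groups, so a three-digit escape cannot be a backreference. It rewrites only the unambiguous forms (a leading 0, or exactly three octal digits) and leaves `\1` and `\12` untouched.

```rust
/// Rewrite unambiguous octal escapes (`\0`, `\012`, `\123`) as `\x{HH}`.
/// Hypothetical helper; one- and two-digit escapes without a leading
/// zero are left alone because they may be backreferences.
fn octal_to_hex(pattern: &str) -> String {
    let chars: Vec<char> = pattern.chars().collect();
    let mut out = String::with_capacity(pattern.len());
    let mut i = 0;
    while i < chars.len() {
        if chars[i] != '\\' {
            out.push(chars[i]);
            i += 1;
            continue;
        }
        // Collect up to three octal digits following the backslash.
        let digits: String = chars[i + 1..]
            .iter()
            .take(3)
            .take_while(|c| ('0'..='7').contains(*c))
            .collect();
        let unambiguous = digits.starts_with('0') || digits.len() == 3;
        if unambiguous {
            let value = u32::from_str_radix(&digits, 8).expect("digits are octal");
            out.push_str(&format!("\\x{{{:02X}}}", value));
            i += 1 + digits.len();
        } else {
            // Copy the backslash and the character it escapes verbatim,
            // so that `\\123` is not mistaken for an octal escape.
            out.push('\\');
            if let Some(&next) = chars.get(i + 1) {
                out.push(next);
            }
            i += 2;
        }
    }
    out
}
```

For example, `octal_to_hex(r"a\123b")` yields `a\x{53}b`, while `(a)\1` and `\\123` pass through unchanged. Note that Perl may treat values above 0x7F as bytes rather than code points, so the result is worth checking for such patterns.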