Parse error #200

Open
toots opened this issue Dec 3, 2021 · 2 comments
Comments

@toots

toots commented Dec 3, 2021

This PCRE pattern raises a parse error:

Re.Pcre.regexp "#{([^}]*?)}"
Exception raised: Re__Perl.Parse_error
Raised at Re__Perl.parse.atom in file "lib/perl.ml", line 168, characters 40-57
Called from Re__Perl.parse.piece in file "lib/perl.ml", line 83, characters 12-19
Called from Re__Perl.parse.branch' in file "lib/perl.ml", line 81, characters 18-26
Called from Re__Perl.parse.regexp in file "lib/perl.ml" (inlined), line 75, characters 30-41
Called from Re__Perl.parse in file "lib/perl.ml", line 253, characters 12-21
Called from Re__Perl.re in file "lib/perl.ml", line 263, characters 4-142
Called from Re__Pcre.regexp in file "lib/pcre.ml" (inlined), line 23, characters 35-50

The same pattern seems perfectly valid for ocaml-pcre, Node, and Perl's own regex engine.

@bcc32
Contributor

bcc32 commented Dec 3, 2021

I don't have a particular opinion on how ocaml-re should handle this, but I got nerd-sniped so here's what I found:

FWIW, in sufficiently new perl (v5.22 and newer), this regexp produces a warning:

Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/#{ <-- HERE ([^}]*?)}/ at temp.pl line 4.

The curly braces normally indicate a repetition count, but no valid integer is found between them. Perl just treats them as literal characters in this case, but this usage is apparently deprecated (from man perlre):

(If a non-escaped curly bracket occurs in a context other than one of
the quantifiers listed above, where it does not form part of a
backslashed sequence like "\x{...}", it is either a fatal syntax
error, or treated as a regular character, generally with a deprecation
warning raised. To escape it, you can precede it with a backslash
("\{") or enclose it within square brackets ("[{]"). This change will
allow for future syntax extensions (like making the lower bound of a
quantifier optional), and better error checking of quantifiers).
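
As a caller-side stopgap, the escaping described in the perlre excerpt can be applied to a pattern before it is handed to the parser. The sketch below is only an illustration: `escape_stray_braces` and `quantifier_end` are hypothetical helpers, not part of ocaml-re, and they use only the standard library. A left brace is left alone when it opens a `{n}`, `{n,}` or `{n,m}` quantifier or sits inside a character class; any other unescaped left brace is rewritten to `\{`. It makes no attempt to handle `\x{...}`-style sequences or a literal `]` at the start of a character class.

```ocaml
(* Hypothetical helpers, not part of ocaml-re: a sketch of escaping
   left braces that do not open a quantifier. *)

let is_digit c = c >= '0' && c <= '9'

(* If the brace at index i opens a quantifier of the form {n}, {n,} or
   {n,m}, return Some j where j is the index of the closing brace. *)
let quantifier_end s i =
  let n = String.length s in
  let rec skip_digits j =
    if j < n && is_digit s.[j] then skip_digits (j + 1) else j
  in
  let j = skip_digits (i + 1) in
  if j = i + 1 then None (* no lower bound, so not a quantifier here *)
  else if j < n && s.[j] = '}' then Some j
  else if j < n && s.[j] = ',' then
    let k = skip_digits (j + 1) in
    if k < n && s.[k] = '}' then Some k else None
  else None

let escape_stray_braces s =
  let n = String.length s in
  let buf = Buffer.create (n + 4) in
  let rec go i in_class =
    if i >= n then ()
    else
      match s.[i] with
      | '\\' when i + 1 < n ->
        (* Copy an existing escape sequence untouched. *)
        Buffer.add_char buf '\\';
        Buffer.add_char buf s.[i + 1];
        go (i + 2) in_class
      | '[' when not in_class ->
        Buffer.add_char buf '[';
        go (i + 1) true
      | ']' when in_class ->
        Buffer.add_char buf ']';
        go (i + 1) false
      | '{' when not in_class ->
        (match quantifier_end s i with
         | Some j ->
           (* A real quantifier: copy it through the closing brace. *)
           Buffer.add_string buf (String.sub s i (j - i + 1));
           go (j + 1) false
         | None ->
           (* A stray brace: emit it escaped. *)
           Buffer.add_string buf "\\{";
           go (i + 1) false)
      | c ->
        Buffer.add_char buf c;
        go (i + 1) in_class
  in
  go 0 false;
  Buffer.contents buf
```

With this helper, the pattern from the report becomes `#\{([^}]*?)}`, which could then be passed to `Re.Pcre.regexp`, assuming ocaml-re's Perl parser accepts a backslash-escaped brace. The closing brace is left as-is because the traceback and the Perl warning both point at the left brace only.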

@toots
Author

toots commented Dec 3, 2021

Thanks, that's very informative. I think it'd be nice if the library could be permissive with its input, to allow for seamless migration from other PCRE stacks, while perhaps displaying a warning.

I have another similar one that I'm about to submit.
