Can regex_syntax support a parser that only parses ASCII character patterns? #1177

Answered by BurntSushi
ybbh asked this question in Q&A
There is no ASCII mode. The error you're getting occurs because, when Unicode mode is disabled, [^\n] matches any byte except for \n. This includes bytes like \xFF, which are neither ASCII nor valid UTF-8, and by default the parser rejects patterns that can match invalid UTF-8.
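To make the \xFF point concrete, here is a minimal std-only sketch. It does not call regex_syntax; the helper names `byte_class_not_newline` and `is_valid_utf8_alone` are hypothetical and only model the behavior described above.

```rust
/// Hypothetical model of what `[^\n]` accepts when Unicode mode is
/// disabled: any single byte except `\n`.
fn byte_class_not_newline(byte: u8) -> bool {
    byte != b'\n'
}

/// Returns true if `byte`, taken on its own, is a valid UTF-8 sequence.
/// Only ASCII bytes (0x00..=0x7F) pass this check.
fn is_valid_utf8_alone(byte: u8) -> bool {
    std::str::from_utf8(&[byte]).is_ok()
}
```

Under this model, `byte_class_not_newline(0xFF)` is true while both `0xFFu8.is_ascii()` and `is_valid_utf8_alone(0xFF)` are false, which is the situation that triggers the error.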

An "ASCII" mode is really just a subset of Unicode mode. So it's best to leave Unicode mode enabled, but change your pattern to match ASCII exclusively. For example, to match the set of ASCII codepoints that aren't \n, you could write [\p{ascii}&&[^\n]].
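As a rough illustration of what that class contains, the following std-only sketch models [\p{ascii}&&[^\n]] by hand rather than going through regex_syntax. The function names are hypothetical and chosen for this example.

```rust
/// Hypothetical model of `[\p{ascii}&&[^\n]]`: the intersection of the
/// ASCII codepoints with "anything except `\n`".
fn ascii_not_newline(c: char) -> bool {
    c.is_ascii() && c != '\n'
}

/// Counts how many Unicode scalar values fall in the modeled class.
fn count_members() -> usize {
    (0u32..=0x10FFFF)
        .filter_map(char::from_u32) // skips the surrogate range
        .filter(|&c| ascii_not_newline(c))
        .count()
}

/// Checks that every member encodes to exactly one UTF-8 byte, so the
/// modeled class can never produce invalid UTF-8.
fn all_members_single_byte() -> bool {
    (0u32..=0x10FFFF)
        .filter_map(char::from_u32)
        .filter(|&c| ascii_not_newline(c))
        .all(|c| c.len_utf8() == 1)
}
```

The modeled class has 127 members (the 128 ASCII codepoints minus \n), and each one is a single valid UTF-8 byte. That is why the intersection works with Unicode mode left enabled.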

Answer selected by ybbh