globset: use non-capture groups in regex transform
We currently implement globs by converting them to regexes, and in doing
so, sometimes use grouping. In all but one case, we used non-capturing
groups. But for alternations, we used capturing groups, which was likely
just an oversight. We don't make use of capture groups at all, and while
they usually don't have any overhead, they lead to weird cases like this
one: rust-lang/regex#1059

That particular issue is also a bug in the regex crate itself, which is
fixed in rust-lang/regex#1062. Note, though, that the bug fix in the regex
crate is still required. With only this patch to globset, memory usage is
reduced (by about half in rust-lang/regex#1059) but does not return to
where it was prior to the regex 1.9 release.
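
For illustration only, here is a minimal std-only sketch of the patched branch, pulled out into a free function. The name `alternation_to_regex` and its signature are hypothetical and not part of globset; in the crate this logic lives inline in the `Tokens` to-regex conversion shown in the diff below.

```rust
/// Hypothetical helper (not globset API): joins already-translated
/// alternates into a single regex fragment, as the patched code does.
fn alternation_to_regex(parts: &[String]) -> String {
    let mut re = String::new();
    // Emit nothing for an empty set, matching the guard in the original code.
    if !parts.is_empty() {
        // "(?:" opens a non-capturing group; the old code pushed '(' here,
        // which opened a capturing group that globset never read.
        re.push_str("(?:");
        re.push_str(&parts.join("|"));
        re.push(')');
    }
    re
}
```

Called with the parts `b` and `a`, this sketch produces `(?:b|a)`, the same fragment that appears between the anchors in the new `re35` test.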
BurntSushi committed Aug 5, 2023
1 parent 341a19e commit 7227e94
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion crates/globset/src/glob.rs
@@ -736,7 +736,7 @@ impl Tokens {
 // It is possible to have an empty set in which case the
 // resulting alternation '()' would be an error.
 if !parts.is_empty() {
-    re.push('(');
+    re.push_str("(?:");
     re.push_str(&parts.join("|"));
     re.push(')');
 }
@@ -1276,6 +1276,7 @@ mod tests {
 toregex!(re32, "/a**", r"^/a.*.*$");
 toregex!(re33, "/**a", r"^/.*.*a$");
 toregex!(re34, "/a**b", r"^/a.*.*b$");
+toregex!(re35, "{a,b}", r"^(?:b|a)$");

 matches!(match1, "a", "a");
 matches!(match2, "a*b", "a_b");