@@ -510,32 +554,34 @@ function checkExponentialBacktracking(path, pattern, ast) {
 			return;
 		}
 
-		const alternatives = node.alternatives;
-
-		const total = toNFA(alternatives[0]);
-		total.withoutEmptyWord();
-		for (let i = 1, l = alternatives.length; i < l; i++) {
-			const a = alternatives[i];
-			const current = toNFA(a);
-			current.withoutEmptyWord();
-
-			if (!total.isDisjointWith(current)) {
-				assert.fail(`${path}: The alternative \`${a.raw}\` is not disjoint with at least one previous alternative.`
-					+ ` This will cause exponential backtracking.`
-					+ `\n\nTo fix this issue, you have to rewrite the ${node.type} \`${node.raw}\`.`
-					+ ` The goal is that all of its alternatives are disjoint.`
-					+ ` This means that if a (sub-)string is matched by the ${node.type}, then only one of its alternatives can match the (sub-)string.`
-					+ `\n\nExample: \`(?:[ab]|\\w|::)+\``
-					+ `\nThe alternatives of the group are not disjoint because the string "a" can be matched by both \`[ab]\` and \`\\w\`.`
-					+ ` In this example, the pattern can easily be fixed because the \`[ab]\` is a subset of the \`\\w\`, so its enough to remove the \`[ab]\` alternative to get \`(?:\\w|::)+\` as the fixed pattern.`
-					+ `\nIn the real world, patterns can be a lot harder to fix.`
-					+ ` If you are trying to make the tests pass for a pull request but can\'t fix the issue yourself, then make the pull request (or commit) anyway.`
-					+ ` A maintainer will help you.`
-					+ `\n\nFull pattern:\n${pattern}`);
-			} else if (i !== l - 1) {
-				total.union(current);
+		withResultCache('disjointAlternatives', node, () => {
+			const alternatives = node.alternatives;
+
+			const total = toNFA(alternatives[0]);
+			total.withoutEmptyWord();
+			for (let i = 1, l = alternatives.length; i < l; i++) {
+				const a = alternatives[i];
+				const current = toNFA(a);
+				current.withoutEmptyWord();
+
+				if (!isDisjointWith(total, current)) {
+					assert.fail(`${path}: The alternative \`${a.raw}\` is not disjoint with at least one previous alternative.`
+						+ ` This will cause exponential backtracking.`
+						+ `\n\nTo fix this issue, you have to rewrite the ${node.type} \`${node.raw}\`.`
+						+ ` The goal is that all of its alternatives are disjoint.`
+						+ ` This means that if a (sub-)string is matched by the ${node.type}, then only one of its alternatives can match the (sub-)string.`
+						+ `\n\nExample: \`(?:[ab]|\\w|::)+\``
+						+ `\nThe alternatives of the group are not disjoint because the string "a" can be matched by both \`[ab]\` and \`\\w\`.`
+						+ ` In this example, the pattern can easily be fixed because the \`[ab]\` is a subset of the \`\\w\`, so its enough to remove the \`[ab]\` alternative to get \`(?:\\w|::)+\` as the fixed pattern.`
+						+ `\nIn the real world, patterns can be a lot harder to fix.`
+						+ ` If you are trying to make the tests pass for a pull request but can\'t fix the issue yourself, then make the pull request (or commit) anyway.`
+						+ ` A maintainer will help you.`
+						+ `\n\nFull pattern:\n${pattern}`);
+				} else if (i !== l - 1) {
+					total.union(current);
+				}
 			}
-		}
+		});
 	}
 
 	visitRegExpAST(ast.pattern, {
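Both hunks wrap an existing check in `withResultCache(name, node, callback)`, whose definition lies outside this diff. As a rough illustration of what such a wrapper might do, the sketch below memoizes the outcome of a check per cache name and per `node.raw`, and rethrows a cached failure. Everything in it is an assumption made for illustration: the `resultCaches` map, the choice of `node.raw` as the key, and the rethrow behaviour are hypothetical and may differ from the real helper.

```javascript
// Hypothetical sketch only: the real `withResultCache` is not shown in this diff.
// One Map per cache name; each maps a node's raw source text to the outcome
// of the check (null for "passed", or the Error that was thrown).
const resultCaches = new Map();

function withResultCache(cacheName, node, compute) {
	let cache = resultCaches.get(cacheName);
	if (cache === undefined) {
		cache = new Map();
		resultCaches.set(cacheName, cache);
	}

	// Assumption: the result depends only on the raw text of the node.
	const key = node.raw;
	if (!cache.has(key)) {
		try {
			compute();
			cache.set(key, null);
		} catch (error) {
			cache.set(key, error);
		}
	}

	// Replay a cached failure so repeated nodes still fail the test.
	const cached = cache.get(key);
	if (cached !== null) {
		throw cached;
	}
}
```

With a key like this, a cached failure would carry the message built for the first occurrence, including its `path`, so a real implementation may need a richer key or a different way to report the error.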
@@ -555,49 +601,51 @@ function checkExponentialBacktracking(path, pattern, ast) {
 				return; // not a group
 			}
 
-			// The idea here is the following:
-			//
-			// We have found a part `A*` of the regex (`A` is assumed to not accept the empty word). Let `I` be
-			// the intersection of `A` and `A{2,}`. If `I` is not empty, then there exists a non-empty word `w`
-			// that is accepted by both `A` and `A{2,}`. That means that there exists some `m>1` for which `w`
-			// is accepted by `A{m}`.
-			// This means that there are at least two ways `A*` can accept `w`. It can be accepted as `A` or as
-			// `A{m}`. Hence there are at least 2^n ways for `A*` to accept the word `w{n}`. This is the main
-			// requirement for exponential backtracking.
-			//
-			// This is actually only a crude approximation for the real analysis that would have to be done. We
-			// would actually have to check the intersection `A{p}` and `A{p+1,}` for all p>0. However, in most
-				assert.fail(`${path}: The quantifier \`${node.raw}\` ambiguous for all words ${JSON.stringify(example)}.repeat(n) for any n>1.`
-					+ ` This will cause exponential backtracking.`
-					+ `\n\nTo fix this issue, you have to rewrite the element (let's call it E) of the quantifier.`
-					+ ` The goal is modify E such that it is disjoint with repetitions of itself.`
-					+ ` This means that if a (sub-)string is matched by E, then it must not be possible for E{2}, E{3}, E{4}, etc. to match that (sub-)string.`
-					+ `\n\nExample 1: \`(?:\\w+|::)+\``
-					+ `\nThe problem lies in \`\\w+\` because \`\\w+\` and \`(?:\\w+){2}\` are not disjoint as the string "aa" is fully matched by both.`
-					+ ` In this example, the pattern can easily be fixed by changing \`\\w+\` to \`\\w\`.`
-					+ `\nExample 2: \`(?:\\w|Foo)+\``
-					+ `\nThe problem lies in \`\\w\` and \`Foo\` because the string "Foo" can be matched as either repeating \`\\w\` 3 times or by using the \`Foo\` alternative once.`
-					+ ` In this example, the pattern can easily be fixed because the \`Foo\` alternative is redundant can can be removed.`
-					+ `\nExample 3: \`(?:\\.\\w+(?:<.*?>)?)+\``
-					+ `\nThe problem lies in \`<.*?>\`. The string ".a<>.a<>" can be matched as either \`\\. \\w < . . . . >\` or \`\\. \\w < > \\. \\w < >\`.`
-					+ ` When it comes to exponential backtracking, it doesn't matter whether a quantifier is greedy or lazy.`
-					+ ` This means that the lazy \`.*?\` can jump over \`>\`.`
-					+ ` In this example, the pattern can easily be fixed because we just have to prevent \`.*?\` jumping over \`>\`.`
-					+ ` This can done by replacing \`<.*?>\` with \`<[^\\r\\n>]*>\`.`
-					+ `\n\nIn the real world, patterns can be a lot harder to fix.`
-					+ ` If you are trying to make this test pass for a pull request but can\'t fix the issue yourself, then make the pull request (or commit) anyway, a maintainer will help you.`
-					+ `\n\nFull pattern:\n${pattern}`);
-			}
+			withResultCache('2star', node, () => {
+				// The idea here is the following:
+				//
+				// We have found a part `A*` of the regex (`A` is assumed to not accept the empty word). Let `I` be
+				// the intersection of `A` and `A{2,}`. If `I` is not empty, then there exists a non-empty word `w`
+				// that is accepted by both `A` and `A{2,}`. That means that there exists some `m>1` for which `w`
+				// is accepted by `A{m}`.
+				// This means that there are at least two ways `A*` can accept `w`. It can be accepted as `A` or as
+				// `A{m}`. Hence there are at least 2^n ways for `A*` to accept the word `w{n}`. This is the main
+				// requirement for exponential backtracking.
+				//
+				// This is actually only a crude approximation for the real analysis that would have to be done. We
+				// would actually have to check the intersection `A{p}` and `A{p+1,}` for all p>0. However, in most
+					assert.fail(`${path}: The quantifier \`${node.raw}\` ambiguous for all words ${JSON.stringify(example)}.repeat(n) for any n>1.`
+						+ ` This will cause exponential backtracking.`
+						+ `\n\nTo fix this issue, you have to rewrite the element (let's call it E) of the quantifier.`
+						+ ` The goal is modify E such that it is disjoint with repetitions of itself.`
+						+ ` This means that if a (sub-)string is matched by E, then it must not be possible for E{2}, E{3}, E{4}, etc. to match that (sub-)string.`
+						+ `\n\nExample 1: \`(?:\\w+|::)+\``
+						+ `\nThe problem lies in \`\\w+\` because \`\\w+\` and \`(?:\\w+){2}\` are not disjoint as the string "aa" is fully matched by both.`
+						+ ` In this example, the pattern can easily be fixed by changing \`\\w+\` to \`\\w\`.`
+						+ `\nExample 2: \`(?:\\w|Foo)+\``
+						+ `\nThe problem lies in \`\\w\` and \`Foo\` because the string "Foo" can be matched as either repeating \`\\w\` 3 times or by using the \`Foo\` alternative once.`
+						+ ` In this example, the pattern can easily be fixed because the \`Foo\` alternative is redundant can can be removed.`
+						+ `\nExample 3: \`(?:\\.\\w+(?:<.*?>)?)+\``
+						+ `\nThe problem lies in \`<.*?>\`. The string ".a<>.a<>" can be matched as either \`\\. \\w < . . . . >\` or \`\\. \\w < > \\. \\w < >\`.`
+						+ ` When it comes to exponential backtracking, it doesn't matter whether a quantifier is greedy or lazy.`
+						+ ` This means that the lazy \`.*?\` can jump over \`>\`.`
+						+ ` In this example, the pattern can easily be fixed because we just have to prevent \`.*?\` jumping over \`>\`.`
+						+ ` This can done by replacing \`<.*?>\` with \`<[^\\r\\n>]*>\`.`
+						+ `\n\nIn the real world, patterns can be a lot harder to fix.`
+						+ ` If you are trying to make this test pass for a pull request but can\'t fix the issue yourself, then make the pull request (or commit) anyway, a maintainer will help you.`
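
The counting argument in the code comment (a word accepted both by `A` and by `A{m}` gives `A*` more than one way to match it, and exponentially many ways to match its repetitions) can be checked by brute force on the examples from the failure message. The sketch below is illustrative only: `countDecompositions` is a hypothetical helper that is not part of the test file, and it counts splits with ordinary `RegExp` tests instead of the NFA intersection the diff uses.

```javascript
// Hypothetical helper: counts the ways `word` can be split into consecutive
// non-empty pieces that are each fully matched by `element`. This equals the
// number of distinct ways `(?:element)+` can accept `word`.
function countDecompositions(element, word) {
	// Anchor the element so that only full matches of a piece count.
	const full = new RegExp('^(?:' + element.source + ')$');

	// ways[i] = number of ways to split the first i characters of `word`.
	const ways = new Array(word.length + 1).fill(0);
	ways[0] = 1;
	for (let end = 1; end <= word.length; end++) {
		for (let start = 0; start < end; start++) {
			if (ways[start] > 0 && full.test(word.slice(start, end))) {
				ways[end] += ways[start];
			}
		}
	}
	return ways[word.length];
}

// Example 1: `\w+` overlaps with its own repetitions, `\w` does not.
console.log(countDecompositions(/\w+/, 'aaaa')); // 8
console.log(countDecompositions(/\w/, 'aaaa')); // 1

// Example 2: "Foo" is either three `\w` or one `Foo`.
console.log(countDecompositions(/\w|Foo/, 'Foo')); // 2

// Example 3: the lazy `.*?` can still run past the first `>`.
console.log(countDecompositions(/\.\w+(?:<.*?>)?/, '.a<>.a<>')); // 2
console.log(countDecompositions(/\.\w+(?:<[^\r\n>]*>)?/, '.a<>.a<>')); // 1
```

A count above 1 for a single word is the ambiguity the check reports. For `\w+` the count doubles with each extra character, which is the growth a backtracking engine may have to explore when the overall match fails.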