Improve RegEx parser, reduce possibilities as the key for arbitrary properties #12121

Merged: RobinMalfait merged 5 commits into master from fix/issue-12109 on Oct 2, 2023

Conversation

RobinMalfait
Contributor

@RobinMalfait RobinMalfait commented Oct 2, 2023

This PR improves the RegEx parser by handling a few more edge cases, especially around minified code. In the reported case, the RegEx saw square brackets and treated their contents as an arbitrary property, which is incorrect.

This PR restricts what is valid as the "key" of an arbitrary property (the key can no longer contain `]`), which in turn solves the bug.

Internal note: The Oxide parser already handled this correctly.

Fixes: #12109

Optimize handling of RegEx parser results:

Previous:
- Copy `results`, for every subsequent result of other `patterns`
- Loop over results to filter out `undefined` values
- Loop over results to map to `clipAtBalancedParens`

Current:
- For each candidate, push the `clipAtBalancedParens(candidate)` into
  the `results`

This way we are not copying existing results, and we are also avoiding
additional loops over the entire array to filter out `undefined` values
and map to `clipAtBalancedParens`.
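
The difference between the two approaches can be sketched roughly as follows. This is an illustration only, not the code from this PR: `extractPrevious`, `extractCurrent`, the sample patterns, and the simplified `clipAtBalancedParens` stand-in are all assumptions made for the example.

```javascript
// Simplified stand-in (assumption): cut the candidate at the first
// closing bracket that has no matching opener.
function clipAtBalancedParens(candidate) {
  let depth = 0
  for (let i = 0; i < candidate.length; i++) {
    const ch = candidate[i]
    if (ch === '(' || ch === '[') {
      depth++
    } else if (ch === ')' || ch === ']') {
      if (depth === 0) return candidate.slice(0, i)
      depth--
    }
  }
  return candidate
}

// "Previous" shape: copy the accumulated results for every pattern,
// then make two more passes over the whole array.
function extractPrevious(content, patterns) {
  let results = []
  for (const pattern of patterns) {
    results = [...results, ...(content.match(pattern) || [])]
  }
  return results.filter((v) => v !== undefined).map(clipAtBalancedParens)
}

// "Current" shape: push each clipped candidate directly, no copies
// and no extra passes.
function extractCurrent(content, patterns) {
  const results = []
  for (const pattern of patterns) {
    for (const candidate of content.match(pattern) || []) {
      results.push(clipAtBalancedParens(candidate))
    }
  }
  return results
}

// Hypothetical patterns and input, for illustration only.
const samplePatterns = [/\b[a-z]+-\d+\b/g, /\b[a-z]+-\[\S+/g]
const sampleContent = 'p-4 m-2 w-[10px])'

console.log(extractPrevious(sampleContent, samplePatterns)) // [ 'p-4', 'm-2', 'w-[10px]' ]
console.log(extractCurrent(sampleContent, samplePatterns)) // [ 'p-4', 'm-2', 'w-[10px]' ]
```

Both functions return the same candidates; the second simply avoids the repeated array copies and the separate filter and map passes.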

Do not allow `]` in the first part of arbitrary properties:

```
[foo:bar]
 ─┬─
  └── This part cannot contain `]`
```

This is also a very targeted fix for cases where an arbitrary property
appears to match a large piece of text but shouldn't.
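
To see why excluding `]` from the key helps, here is a minimal sketch using two deliberately simplified regular expressions. They are assumptions for illustration, not the patterns Tailwind actually uses, and the minified snippet is made up.

```javascript
// Permissive key (assumption): anything except whitespace and `:`.
const permissiveProperty = /\[[^\s:]+:\S+?\]/g

// Restricted key (assumption): additionally, the key may not contain `]`.
const restrictedProperty = /\[[^\s:\]]+:\S+?\]/g

// Made-up minified code followed by a real arbitrary property.
const minified = 'let x=a[0]?b:c[1]; [mask-type:alpha]'

// The permissive key swallows `0]?b`, so array indexing plus a ternary
// is mistaken for an arbitrary property.
console.log(minified.match(permissiveProperty))
// [ '[0]?b:c[1]', '[mask-type:alpha]' ]

// With `]` excluded from the key, only the real property matches.
console.log(minified.match(restrictedProperty))
// [ '[mask-type:alpha]' ]
```

The restricted version fails fast on `[0]` and `[1]` because the key must reach a `:` before any `]` appears.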
@RobinMalfait RobinMalfait changed the title from "Improve RegEx parser, catch edge cases in minified code" to "Improve RegEx parser, reduce possibilities as the key for arbitrary properties" on Oct 2, 2023
@RobinMalfait RobinMalfait merged commit 0af88b1 into master Oct 2, 2023
10 checks passed
@RobinMalfait RobinMalfait deleted the fix/issue-12109 branch October 2, 2023 14:19
thecrypticace pushed a commit that referenced this pull request Oct 23, 2023
…roperties (#12121)

* optimize handling of RegEx parser results

* do not allow `]` in the first part of arbitrary properties

* add real world tests for parsing candidate strings

* sync package-lock.json

* update changelog
Development

Successfully merging this pull request may close these issues.

Classes such as p-1.5 and p-2.5 are not detected and purged in minified javascript during build