fuzz_regex_match was not rolled back before merge #1037

addisoncrump · 2023-07-12T13:36:51Z

We previously discussed how changing the input shape may affect the usefulness of existing inputs discovered by clusterfuzz and I think this still holds true. Can we (once again) revert fuzz_regex_match? Hopefully this hasn't caused Clusterfuzz to cleanse the existing corpus too much.

BurntSushi · 2023-07-12T13:42:06Z

Hmmm right. So I think the issue I have with this is that it effectively puts a ceiling on how much each fuzz target can change. That seems pretty sub-optimal to me. How much does re-using the existing corpus really matter? Is it just about confirming that old regressions don't re-appear? If so, I'm usually pretty good about adding separate regression tests to the test suite, so I'm not too worried about that.

addisoncrump · 2023-07-13T07:53:05Z

Okay, to address these individually:

Reusing the existing corpus allows the fuzzer to have a starting point for mutating the inputs (since fuzzers work by making changes to existing inputs from the corpus). For Clusterfuzz, this tends to represent CPU-years worth of computation, and is normally a quite effective representation of the potential input space of the program. The corpus is strictly not a regression test suite -- it is composed of inputs which explore the state space of the program well so that mutant inputs are meaningful.
I wouldn't worry about restricting fuzzers here -- you may have a corpus for each fuzzer in Clusterfuzz, so it is not an issue to have different fuzzers with different input shapes. What I recommend is having one fuzzer with the old pattern (so that you can reuse the exploration of the older version of regex) and adding new fuzzers with new patterns.
Restricting yourself to Arbitrary-style fuzzers may prevent some inputs from existing, as we discussed previously. I think in this case it's not too much of a concern, but I do worry about the effect of mixing config bytes with bytes that represent regexes.

Paging @jonathanmetzman here -- I'm not sure what the eviction policy for corpora on Clusterfuzz/OSS-Fuzz is.

BurntSushi · 2023-07-13T12:54:26Z

Yeah to be honest this is all pretty opaque to me and it seems like something that really just isn't practical. There is no real check against changing the fuzzer in a way that makes previous inputs bunk. Just expecting the developers "to know" which change will and won't mess things up, and expecting the fuzzer to never evolve just does not seem practical to me.The costs here aren't totally clear either. Like, I understand what you're saying conceptually, but it just doesn't seem that big of a deal to me to upend the fuzz targets once in a while. But like I said, it's difficult to know concrete what one is missing out on.

addisoncrump · 2023-07-13T13:32:40Z

Yeah, I agree that this is not clearly communicated. Fuzzers sadly are often treated as black boxes, but it's difficult to communicate the (often extremely subtle) potential footguns well. They are not as magic as we'd like them to be 😅

If this is the preferred input shape going forward, then it's probably fine to let the corpus rebuild, but you may not get any results for a while -- it may take some time for the fuzzer to explore again. I recommend avoiding changing this often. I'm sorry I didn't communicate the significance of this better earlier.

Regarding the fuzzer evolving: normally, we don't observe the input shape changing, just the target (e.g. adding an assertion, maybe adding a feature, etc.). This is a pretty unique and atypical case, which is perhaps one of the reasons it is not so well documented. Going forward, though, it's possible to add additional fuzzers to OSS-Fuzz, so evolution should potentially be new fuzzers if it affects the input shape, or a dramatically different test.

I forgot to do this step, and as a result, OSS-fuzz hasn't been running any of the new fuzzers. Hopefully this is enough. Ref #1037

BurntSushi · 2023-07-13T13:46:17Z

Gotya.

And I just realized that probably none of the new fuzzers are running because this script needs to be updated and hasn't been yet:

regex/fuzz/oss-fuzz-build.sh

Line 4 in 48daadc

cp fuzz/target/x86_64-unknown-linux-gnu/release/fuzz_regex_match $OUT/

I put up a PR to fix this here: #1042

BurntSushi · 2023-07-13T13:46:46Z

(I thought it was amazing they hadn't found anything yet too. Drats.)

I forgot to do this step, and as a result, OSS-fuzz hasn't been running any of the new fuzzers. Hopefully this is enough. Ref #1037

addisoncrump · 2023-07-13T15:07:38Z

Closing this as completed.

jonathanmetzman · 2023-07-14T20:23:57Z

I think once you change the format, corpus minimization later that day will trash (all?) of your previous corpus if it no longer produces the same behavior.

addisoncrump · 2023-07-15T13:03:22Z

I think once you change the format, corpus minimization later that day will trash (all?) of your previous corpus if it no longer produces the same behavior.

Good to know. Would it be possible to document this behaviour somewhere? I could only find docs for testcase minimisation.

BurntSushi added the question label Jul 12, 2023

BurntSushi added a commit that referenced this issue Jul 13, 2023

fuzz: add all fuzzers to OSS-fuzz

0560985

I forgot to do this step, and as a result, OSS-fuzz hasn't been running any of the new fuzzers. Hopefully this is enough. Ref #1037

BurntSushi mentioned this issue Jul 13, 2023

fuzz: add all fuzzers to OSS-fuzz #1042

Merged

BurntSushi added a commit that referenced this issue Jul 13, 2023

fuzz: add all fuzzers to OSS-fuzz

855c5c4

I forgot to do this step, and as a result, OSS-fuzz hasn't been running any of the new fuzzers. Hopefully this is enough. Ref #1037

addisoncrump closed this as completed Jul 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fuzz_regex_match was not rolled back before merge #1037

fuzz_regex_match was not rolled back before merge #1037

addisoncrump commented Jul 12, 2023

BurntSushi commented Jul 12, 2023

addisoncrump commented Jul 13, 2023

BurntSushi commented Jul 13, 2023

addisoncrump commented Jul 13, 2023

BurntSushi commented Jul 13, 2023

BurntSushi commented Jul 13, 2023

addisoncrump commented Jul 13, 2023

jonathanmetzman commented Jul 14, 2023

addisoncrump commented Jul 15, 2023

fuzz_regex_match was not rolled back before merge #1037

fuzz_regex_match was not rolled back before merge #1037

Comments

addisoncrump commented Jul 12, 2023

BurntSushi commented Jul 12, 2023

addisoncrump commented Jul 13, 2023

BurntSushi commented Jul 13, 2023

addisoncrump commented Jul 13, 2023

BurntSushi commented Jul 13, 2023

BurntSushi commented Jul 13, 2023

addisoncrump commented Jul 13, 2023

jonathanmetzman commented Jul 14, 2023

addisoncrump commented Jul 15, 2023