Add a new CI job to fuzz the parser #11089

AlexWaygood · 2024-04-22T12:27:26Z

Summary

This PR adds a new CI job that tests the parser by running the new scripts/fuzz-parser script. The job runs the parser over 1,000 randomly generated (but all syntactically valid) Python source files. For any files that the parser fails to parse, the script minimises the generated source code into a minimal reproducible example. On completion of the script, if any bugs were encountered, the script fails with exit code 1.

How many seeds should we pass to the script?

This PR currently passes 1,001 seeds to the script for it to test. In an ideal world, we'd obviously run it over a larger range of seeds, but I think 1,001 seeds is enough for it to serve as a useful regression test (I discovered the bug that was fixed in #11009 on seed 6 with an early version of the fuzz-parser script).

The new job takes around ten minutes to run in CI. That's quicker than the ecosystem job if the ecosystem job runs both the formatter and linter ecosystem checks, but it's slower than the ecosystem job if that job runs only the linter ecosystem checks. If we think 10 minutes is too long for this job, we could pass a smaller range of seeds to the script in CI.

One way we could get the "best of both worlds" might be to pass it a small range of seeds for CI on PRs (500?) but have a nightly job that passes a large range of seeds (5,000? 10,000?). We could have a bot that automatically opens an issue when the nightly job fails -- we do something like this at typeshed: python/typeshed#11801

github-actions · 2024-04-22T12:44:37Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

dhruvmanila

Thank you! I'm not exactly sure about having this in CI, but I'm fine adding it for now. For now, I just run the fuzzer locally with random seed numbers for around 2-3k inputs. Do you know how this affects the CI minutes?

The new job takes around ten minutes to run in CI. That's quicker than the ecosystem job if the ecosystem job runs both the formatter and linter ecosystem checks, but it's slower than the ecosystem job if that job runs only the linter ecosystem checks. If we think 10 minutes is too long for this job, we could pass a smaller range of seeds to the script in CI.

I think we can start with 500 for now if it's possible to add a random seed number for every invocation of the CI.

dhruvmanila · 2024-04-23T08:20:31Z

.github/workflows/ci.yaml

+          # Make executable, since artifact download doesn't preserve this
+          chmod +x ${{ steps.download-cached-binary.outputs.download-path }}/ruff
+
+          python scripts/fuzz-parser/fuzz.py 0-1_000 --test-executable ${{ steps.download-cached-binary.outputs.download-path }}/ruff


Does having the same seed value mean that we'll be fuzzing over the same randomly generated source code? It seems like so:

In [7]: for i in range(100): ...: a = generate(i) ...: b = generate(i) ...: print(f"{i}: {a == b}") ...: 0: True 1: True 2: True 3: True 4: True 5: True 6: True 7: True 8: True 9: True 10: True 11: True 12: True 13: True 14: True 15: True ...

Do you think it would be useful to have a randomly generated seed value in here? I don't know how easy it is to add something like that to GitHub Actions.

Yup, the script is deterministic — the same seed always generates the same random source file.

I think it depends on whether we think the purpose of running it in CI is primarily to catch regressions or to find previously unknown bugs. If the purpose is to catch regressions, we should run it over the same set of seeds on each PR (because otherwise we won't know that it's a new bug without doing some investigation). If the purpose is to find unknown bugs (new or pre-existing), then maybe a nightly workflow running it over a large set of randomly selected seeds would be better. If both, then maybe we should do both 😛

Do you think it would be useful to have a randomly generated seed value in here? I don't know how easy it is to add something like that to GitHub Actions.

If we want to do this, my instinct would be to just add a CLI option --random-seeds or something, and then use Python's random module to generate the seeds from inside the script

Oh, that's a good point. I think it's fine to run it on the same set of seeds and I can provide some context in the contributing docs regarding this. Thanks for the context.

AlexWaygood · 2024-04-23T10:17:12Z

I've reduced the number of seeds in CI to 501. I'll land this now as a regression test to run on PRs, and I'll separately explore running a larger, random set of seeds as a nightly workflow

AlexWaygood · 2024-04-23T10:28:26Z

For now, I just run the fuzzer locally with random seed numbers for around 2-3k inputs. Do you know how this affects the CI minutes?

It seems like in CI (which uses a debug build and runs on machines with only a few cores, so seems to be a fair bit slower than running the script locally), it reliably fuzzes around 100 seeds a minute. So passing 500 seeds means the script takes 5 minutes; passing 1,000 seeds means the script takes 10 minutes; etc.

Add a new CI job to fuzz the parser

22fecb2

AlexWaygood requested review from MichaReiser and dhruvmanila April 22, 2024 12:27

MichaReiser added the ci Related to internal CI tooling label Apr 23, 2024

MichaReiser approved these changes Apr 23, 2024

View reviewed changes

dhruvmanila approved these changes Apr 23, 2024

View reviewed changes

Reduce number of seeds to 501

1148f9b

AlexWaygood enabled auto-merge (squash) April 23, 2024 10:17

AlexWaygood merged commit f5c7a62 into astral-sh:main Apr 23, 2024
18 checks passed

AlexWaygood deleted the fuzz-ci branch April 23, 2024 10:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a new CI job to fuzz the parser #11089

Add a new CI job to fuzz the parser #11089

AlexWaygood commented Apr 22, 2024

github-actions bot commented Apr 22, 2024 •

edited

dhruvmanila left a comment

dhruvmanila Apr 23, 2024

AlexWaygood Apr 23, 2024 •

edited

AlexWaygood Apr 23, 2024

dhruvmanila Apr 23, 2024

AlexWaygood commented Apr 23, 2024 •

edited

AlexWaygood commented Apr 23, 2024 •

edited

Add a new CI job to fuzz the parser #11089

Add a new CI job to fuzz the parser #11089

Conversation

AlexWaygood commented Apr 22, 2024

Summary

How many seeds should we pass to the script?

github-actions bot commented Apr 22, 2024 • edited

ruff-ecosystem results

Linter (stable)

Linter (preview)

Formatter (stable)

Formatter (preview)

dhruvmanila left a comment

Choose a reason for hiding this comment

dhruvmanila Apr 23, 2024

Choose a reason for hiding this comment

AlexWaygood Apr 23, 2024 • edited

Choose a reason for hiding this comment

AlexWaygood Apr 23, 2024

Choose a reason for hiding this comment

dhruvmanila Apr 23, 2024

Choose a reason for hiding this comment

AlexWaygood commented Apr 23, 2024 • edited

AlexWaygood commented Apr 23, 2024 • edited

github-actions bot commented Apr 22, 2024 •

edited

`ruff-ecosystem` results

AlexWaygood Apr 23, 2024 •

edited

AlexWaygood commented Apr 23, 2024 •

edited

AlexWaygood commented Apr 23, 2024 •

edited