Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up --store support #138

Closed
imphil opened this issue Jan 2, 2024 · 11 comments
Closed

Speed up --store support #138

imphil opened this issue Jan 2, 2024 · 11 comments

Comments

@imphil
Copy link

imphil commented Jan 2, 2024

Thanks for providing this useful tool.

I looked into validating pyproject.toml with a cibuildwheel section. A schema for the cibuildwheel section is included in https://json.schemastore.org/pyproject.json (as of pypa/cibuildwheel#1622), but not in the JSON schema shipped with validate-pyproject. So I went to use the new --store command-line option in my pre-commit configuration like this:

- repo: https://github.com/abravalheri/validate-pyproject
  rev: main
  hooks:
    - id: validate-pyproject
      args:
        # Load an up-to-date JSON schema for pyproject.toml which references
        # the tool schemas used within pyproject.toml.
        - "--store"
        - "https://json.schemastore.org/pyproject.json"

This approach works beautifully on a technical level (i.e., the cibuildwheel tool section is validated correctly), but comes at a cost: running validate-pyproject now takes multiple seconds.

$ time validate-pyproject --store https://json.schemastore.org/pyproject.json pyproject.toml
Valid file: pyproject.toml

________________________________________________________
Executed in    8.15 secs    fish           external
   usr time    1.08 secs  931.00 micros    1.08 secs
   sys time    0.11 secs  116.00 micros    0.11 secs


$ time validate-pyproject pyproject.toml
Valid file: pyproject.toml

________________________________________________________
Executed in  148.49 millis    fish           external
   usr time  132.08 millis    1.26 millis  130.82 millis
   sys time   16.51 millis    0.16 millis   16.35 millis

I haven't done any profiling, but I would assume downloading the individual schemas at least contributes to the cost.

@henryiii, is this something you're seeing as well, and is there already a plan to address this (caching?)

@henryiii
Copy link
Collaborator

henryiii commented Jan 2, 2024

Yes, the main idea I've been working on is providing a regularly updated copy of the schema store files as a plugin. The problem with releasing this has been #134, since some schema (cibuildwheel, specifically) required fragment support. That's actually not true anymore, since cibuildwheel now uses a partial, so no current files require fragments. This solves caching as well as stability - a regularly updated Python package can be pinned as well.

The --store option could be done in parallel, too. I'd normally do this with async, and that is required for WebAssembly support, but we still support Python 3.6, so that's out.

@henryiii
Copy link
Collaborator

@abravalheri, I have pushed https://github.com/henryiii/validate-pyproject-schema-store if you'd like to take a look.

@abravalheri
Copy link
Owner

abravalheri commented Jan 10, 2024

Very nice @henryiii , thank you!

I wonder if it would make sense to add this in the all extras dependency of validate-pyproject itself (or another extra), or if that would cause cycles...

@henryiii
Copy link
Collaborator

I'm not sure about "all". A [schema-store] extra could be nice. We could be careful to avoid cycles, I think (validate-pyproject-schema-store only depends on validate-pyproject via an extra, too).

Where would you like validate-pyproject-schema-store to live, by the way? I have it in henryiii for now, but would be happy to move it now or eventually. If you'd like them both to live together here, for example, that would be fine.

@henryiii
Copy link
Collaborator

FYI, I've released validate-pyproject-schema-store, but it doesn't work currently, since the released version of validate-pyproject (0.15) doesn't support uint (Ruff).

@abravalheri
Copy link
Owner

Hi @henryiii, is it just a matter of cutting a new release?
We should be in a good shape for that. I will proceed and do that now.

@henryiii
Copy link
Collaborator

Awesome, thanks! It's working. See https://scientific-python.github.io/repo-review/ for an initial demo.

@henryiii
Copy link
Collaborator

Will see how the auto-update works after SchemaStore/schemastore#3548.

@henryiii
Copy link
Collaborator

henryiii commented Jan 24, 2024

Auto update works, merged and released it from my phone. New tool in this update, too, tool.pyright. :)

[schema-store] would make sense, not sure about all, as validate-pyproject-schema-store[all] requests validate-pyproject[all], not sure how that is handled when circular.

@henryiii
Copy link
Collaborator

This already caught a mistake in cookie's template: scientific-python/cookie#364 (comment) Nice!

@imphil
Copy link
Author

imphil commented Feb 5, 2024

Thanks @henryiii for the work on validate-pyproject-schema-store. With that we're down to around 1.5 seconds to validate our pyproject.toml file with pre-commit. Which isn't extremely fast, but fast enough for our use case. With that in mind, I'm happy to close this issue as resolved.

@imphil imphil closed this as completed Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants