Enable caching by default to help PyPI #54

Closed
hugovk opened this issue Sep 6, 2024 · 4 comments · Fixed by #193
hugovk commented Sep 6, 2024

I see caching is disabled by default (except for self-hosted runners).

What are the reasons for not enabling it by default?

Nearly 30% of PyPI's downloads come from CIs (via BigQuery). Given 47.1 billion total downloads per month (via PyPI Stats), at least 13.8 billion downloads/month come from CIs.

Last month, PyPI used 105.5 PB of bandwidth, which at list price would cost around $12.5 million per month (via PyPA Discord). Fastly very generously gives PyPI a 100% discount on their bill.

At PyCon US this year, Fastly said that over 10 years they have served 4,682.1 billion requests and 2.41 exabytes of data.

If astral-sh/setup-uv enabled caching by default, this would significantly help ease the load on PyPI and Fastly, plus it can speed up runs by several minutes.

eifinger (Collaborator) commented Sep 7, 2024

I will add a more thorough answer soon, but as a short teaser, here is why caching is currently disabled by default:

  1. The other setup-x actions also default to disabling the cache
  2. Each repo can only have 10 GB of cached artifacts, and the oldest caches are evicted when this is exceeded. This might be unwanted behavior for some users.
  3. I am not entirely sure whether the cache counts towards the billed artifacts usage. If so, this action would by default incur more costs than expected on private repos.

eifinger (Collaborator) commented

  1. It would be a performance penalty for self-hosted runners by default

eifinger (Collaborator) commented

In a future release, caching will be automatically enabled on GitHub-hosted runners.

I think this strikes a good balance between the arguments outlined so far.

xmatthias commented Dec 29, 2024

@eifinger wouldn't you also need to set prune-cache: false to actually reduce the load on PyPI?

As I understand that setting (which defaults to true, btw), only locally built wheels are cached, while wheels downloaded from PyPI are pruned, resulting in an empty cache if every package has a wheel on PyPI.

Enabling caching automatically will then not do much to reduce the load on PyPI / Fastly ...
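
For illustration, a minimal sketch of a workflow that sets both options explicitly. It assumes the action's enable-cache and prune-cache inputs behave as described in this thread; the version tags, step names, and the uv sync command are placeholders and not taken from this issue:

```yaml
# Hypothetical workflow fragment; version tags and step names are assumptions.
steps:
  - uses: actions/checkout@v4

  - name: Install uv
    uses: astral-sh/setup-uv@v5
    with:
      # Opt in to caching explicitly instead of relying on the default.
      enable-cache: true
      # Keep wheels downloaded from PyPI in the cache, not only locally
      # built ones, so later runs can skip those downloads.
      prune-cache: false

  - name: Install dependencies
    run: uv sync
```

The trade-off is a larger cache, which counts against the 10 GB per-repo limit mentioned above.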
