Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPU count logic does not distinguish "usable" CPUs inside cgroup'd containers (e.g. inside k8s) #2978

Closed
jdb8 opened this issue Aug 28, 2023 · 1 comment · Fixed by #2979
Closed

Comments

@jdb8
Copy link
Contributor

jdb8 commented Aug 28, 2023

search you tried in the issue tracker

multiprocessing, k8s, usable, cpu

describe your issue

When invoking pre-commit from within a container environment with cgroup cpu enforcement, pre-commit fans out more than expected (based on the total number of CPUs rather than the number of usable CPUs). This ends up causing unexpected behaviour and makes it hard to optimise for performance when running hooks in CI inside containers (since pre-commit's parallelism is non-configurable), which leaves us with either "shard using too many CPUs" or require_serial: true, both of which aren't ideal.

The cause is seemingly due to the implementation in xargs.py:

return multiprocessing.cpu_count()
which uses multiprocessing.cpu_count(), which is documented to not [be] equivalent to the number of CPUs the current process can use. This is confirmed by running python3 -c 'import multiprocessing; print(multiprocessing.cpu_count())' inside the container environment: the number is higher than expected.

From the docs, it looks like a cgroup-compatible way of grabbing the number of usable CPUs would be to use len(os.sched_getaffinity(0)), but I don't know if that has undesirable downsides. I also don't know if this would be a disruptive breaking change to anyone relying on the old behaviour, so I wanted to make this issue first to get your thoughts.

pre-commit --version

pre-commit 2.19.0

.pre-commit-config.yaml

n/a - can provide a dummy config invoking `python3 -c 'import multiprocessing; print(multiprocessing.cpu_count())'` if more info needed, but this isn't specific to any one hook or config option

~/.cache/pre-commit/pre-commit.log (if present)

No response

@asottile
Copy link
Member

sched_getaffinity isn't available on other platforms but could probably be used if available

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants