This isn't a question or user support case (For Q&A and community support, go to Discussions).
I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
1. Set `spec.maxRunners` to `0` for the `AutoscalingRunnerSet`.
2. Start workflow jobs.
3. Note that the listener acquires the jobs despite having no capacity to process them.
Describe the bug
According to the official docs, the listener is supposed to check whether it can scale up to the desired count before acquiring a job:
> When the Runner ScaleSet Listener receives the Job Available message, it checks whether it can scale up to the desired count. If it can, the Runner ScaleSet Listener acknowledges the message
However, when we set `maxRunners` to `0` to stop a listener from acquiring new jobs while the jobs already in progress finish, the listener continued to acquire jobs despite the autoscaling settings (see logs).
We have a multi-cluster setup and expected all jobs to be handled by one cluster while the other has `maxRunners` set to `0`. Instead, jobs ended up queued, because the listener in the `maxRunners: 0` cluster was still acquiring them.
I don't see any checks in the listener code before it acquires the jobs.
We are aware of this problem and are gradually working on a fix. It is the reason we added capacity information in the latest release, so that the server is aware of a listener's capacity and will not offer it jobs that it cannot handle. I will, however, bring this issue to the team to check whether we should do something on the listener before the server changes are ready.
Checks
Controller Version
0.9.0
Deployment Method
Helm
Listener code referenced above:
actions-runner-controller/cmd/githubrunnerscalesetlistener/autoScalerService.go
Lines 100 to 205 in f7eb88c
Describe the expected behavior
The listener should verify autoscaling settings to ensure it can handle a job before acquiring it
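To make the expected behavior concrete, here is a minimal sketch of the kind of capacity check described above. It is not the listener's actual code: `acquirableJobs` and `selectJobsToAcquire` are hypothetical names, and the sketch assumes the listener knows its `maxRunners` value, the number of jobs already assigned to its runners, and the request IDs of the jobs being offered. It also assumes that a `maxRunners` of `0` means "acquire nothing".

```go
package main

// acquirableJobs is a hypothetical helper, not part of actions-runner-controller.
// It returns how many of the available jobs a listener could acquire without
// exceeding maxRunners, given the number of jobs already assigned to its runners.
// Assumption: maxRunners <= 0 means the listener must not acquire anything.
func acquirableJobs(maxRunners, assignedJobs, availableJobs int) int {
	if maxRunners <= 0 || availableJobs <= 0 {
		return 0
	}
	free := maxRunners - assignedJobs
	if free <= 0 {
		// Already at or above capacity: leave the jobs for other listeners.
		return 0
	}
	if availableJobs < free {
		return availableJobs
	}
	return free
}

// selectJobsToAcquire (also hypothetical) keeps only as many request IDs as
// there is spare capacity for, preserving their original order.
func selectJobsToAcquire(requestIDs []int64, maxRunners, assignedJobs int) []int64 {
	n := acquirableJobs(maxRunners, assignedJobs, len(requestIDs))
	return requestIDs[:n]
}
```

With a check like this, `selectJobsToAcquire(ids, 0, 0)` would return an empty slice, so a listener with `maxRunners` set to `0` would leave every offered job for listeners in other clusters.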
Listener Logs
https://gist.github.com/prizov/64a1045e83cb8b1612087109602cc2c1
Note the scaling settings on line 10 https://gist.github.com/prizov/64a1045e83cb8b1612087109602cc2c1#file-gistfile1-txt-L10