Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix #3282] idle-timeout not waiting on all workers in cluster mode #3283

Conversation

joshuay03
Copy link
Contributor

@joshuay03 joshuay03 commented Nov 28, 2023

Description

Closes #3282

Ensures that the master process (and therefore the entire cluster) is only gracefully shut down when all workers have reached the idle timeout when in cluster mode, as opposed to any worker (current implementation).

Also adds a log to indicate that the idle time out has been reached in both modes
Puma starting in single mode...
* Puma version: 6.4.0 (ruby 3.2.2-p53) ("The Eagle of Durango")
*  Min threads: 0
*  Max threads: 5
*  Environment: development
*          PID: 70263
* Listening on http://0.0.0.0:9292
Use Ctrl-C to stop
- Idle timeout reached # 👈🏽
- Gracefully stopping, waiting for requests to finish
=== puma shutdown: 2023-11-29 16:15:05 +1000 ===
- Goodbye!
[70332] Puma starting in cluster mode...
[70332] * Puma version: 6.4.0 (ruby 3.2.2-p53) ("The Eagle of Durango")
[70332] *  Min threads: 0
[70332] *  Max threads: 5
[70332] *  Environment: development
[70332] *   Master PID: 70332
[70332] *      Workers: 2
[70332] *     Restarts: (✔) hot (✔) phased
[70332] * Listening on http://0.0.0.0:9292
[70332] Use Ctrl-C to stop
[70332] - Worker 0 (PID: 70346) booted in 0.0s, phase: 0
[70332] - Worker 1 (PID: 70347) booted in 0.0s, phase: 0
[70332] - All workers reached idle timeout # 👈🏽
[70332] - Gracefully shutting down workers...
[70332] === puma shutdown: 2023-11-29 16:15:35 +1000 ===
[70332] - Goodbye!

FWIW I've manually tested this against the Rails monolith I work on and it's looking good!

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@joshuay03 joshuay03 force-pushed the clustered-idle-shutdown-not-waiting-on-all-workers branch 8 times, most recently from 238a792 to d32eb90 Compare November 28, 2023 07:00
@joshuay03
Copy link
Contributor Author

joshuay03 commented Nov 28, 2023

turbo-rails CI failures seem to be coming from upstream: https://github.com/hotwired/turbo-rails/actions/runs/6979009463/job/18991454469

Not sure about the jruby and truffleruby failures though.

@joshuay03 joshuay03 force-pushed the clustered-idle-shutdown-not-waiting-on-all-workers branch 7 times, most recently from 61b5160 to 6bede5c Compare November 28, 2023 23:33
@joshuay03 joshuay03 marked this pull request as ready for review November 28, 2023 23:33
@joshuay03 joshuay03 force-pushed the clustered-idle-shutdown-not-waiting-on-all-workers branch 4 times, most recently from f3a24dd to 264ecf8 Compare November 29, 2023 06:56
@joshuay03 joshuay03 force-pushed the clustered-idle-shutdown-not-waiting-on-all-workers branch from 264ecf8 to d0e4227 Compare November 30, 2023 08:47
@MSP-Greg
Copy link
Member

@joshuay03

Thanks for working on this. I haven't reviewed all of it.

One thing. We cannot change the method signature of Server.new, as it will break a lot of external code that uses just a Puma::Server. I'd suggest adding it to options before the call to Server.new, then removing it in Server.new.

@joshuay03
Copy link
Contributor Author

joshuay03 commented Nov 30, 2023

One thing. We cannot change the method signature of Server.new, as it will break a lot of external code that uses just a Puma::Server. I'd suggest adding it to options before the call to Server.new, then removing it in Server.new.

That's fair, I'll update.

Edit: Adressed in https://github.com/puma/puma/compare/df23fbf1a7d855f3158c2d3976592b6f5dfe7200..0749bd7dbe7ee655e0f86866a4a0dbe882b6c38b

@joshuay03 joshuay03 force-pushed the clustered-idle-shutdown-not-waiting-on-all-workers branch 3 times, most recently from 0749bd7 to e90fe64 Compare November 30, 2023 23:52
@joshuay03 joshuay03 force-pushed the clustered-idle-shutdown-not-waiting-on-all-workers branch from e90fe64 to 2bd0b49 Compare November 30, 2023 23:55
Comment on lines -229 to +232
connect
10.times {
fast_connect
sleep 0.5
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated test fails on master with:

Screenshot 2023-12-01 at 9 56 34 am

@nateberkopec nateberkopec merged commit f70bebd into puma:master Jan 2, 2024
55 of 64 checks passed
@joshuay03 joshuay03 deleted the clustered-idle-shutdown-not-waiting-on-all-workers branch January 2, 2024 22:46
@dentarg dentarg added the bug label Mar 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

--idle-timeout should not shut down entire cluster if any single worker is idle
5 participants