Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple QTP idleTimeout from pool shrink rate #9237

Closed
magibney opened this issue Jan 31, 2023 · 13 comments
Closed

Decouple QTP idleTimeout from pool shrink rate #9237

magibney opened this issue Jan 31, 2023 · 13 comments

Comments

@magibney
Copy link

Enhancement Description

It should be possible to independently configure the rate at which QueuedThreadPool, having seen a transient spike in number of threads, shrinks back to a normal/baseline number of threads.

This issue has been raised several times; #4585 provides excellent context.

Afaict, this comment sums up the current approach:

The code is doing the slow shrinking by design: the idea is that if you had a load spike that caused the creation of lots of threads, you don't want to shrink too aggressively because you can have another load spike.
Therefore we only shrink 1 thread every idleTimeout rather than all of those that are idle.

This makes sense when idle threads are guaranteed to be cheap. For the purpose of this issue, let's acknowledge that this should always be the case, and that cases where persistently large thread pools result in a bottleneck on system resources are arguably better addressed in some other way.

Even acknowledging the above however, it still makes sense in principle to decouple the rate of pool shrinkage from the amount of time that individual threads are allowed to remain idle before being eligible to be reclaimed.

Paraphrasing from this comment (I think the math in the original comment was inverted!):

The threadpool already has its own idleTimeout, which can be configured to an aggressive small number if desired. There is very little difference between shrinking 100 threads every 60000ms vs 10 threads every 6000ms vs 1 thread every 600ms.

This is a good example of the options currently available; but I'd argue there's a significant difference between the alternatives suggested. There's obviously no difference at the 60000ms mark: starting with 100 idle threads, each alternative would hit 0 threads at 60s. But notably, the first option would retain all 100 threads all the way out to the 60s point, and would thus be better able to respond to certain types of bursty traffic.

For the purpose of discussion I pushed a single commit that I think should address this.

@sbordet
Copy link
Contributor

sbordet commented Jan 31, 2023

What is exactly the problem this enhancement would address?

I understand you would like to shrink the thread pool more aggressively, but why exactly?

@magibney
Copy link
Author

In the particular case I'm dealing with, the "idle threads are cheap" assumption does not hold. We're using Solr, and under the hood Lucene has historically used ThreadLocal caching in some cases. This is fixed in the most recent Lucene version, so the problem will be best addressed in time by avoiding pathological usage of ThreadLocal caching.

That said, I'm raising this issue here because in principle I think it's probably the right thing to do to allow configuring pool decay in a way that's independent from individual thread idleTimeout.

A little more detail/context: we see certain isolated events that generate a spike in thread count (from ~2k baseline to ~4k or higher). Ideally we don't want to reduce the idleTimeout (currently set to 120s) for the standard reasons of wanting to be able to handle bursty traffic. But with the current idleTimeout setting, it takes nearly 3 days for a 2k spike in threads to fall back to baseline levels. We're evaluating possibly substantially reducing the idleTimeout as the currently available workaround -- but this itself feels like a workaround to a workaround (ideally we wouldn't need to compromise on individual thread idleTimeout in order to achieve faster pool decay).

Again, I'm not really trying to "solve" this problem -- the underlying problem is not a problem with Jetty (this is why my initial description was a bit vague wrt the motivating case -- sorry!). But it's a problem that at a high level ("threads are not always cheap") is not unique to this situation, and that could I believe be readily worked around by decoupling idleTimeout from pool shrinkage, separating two concepts that, although related, are conceptually distinct anyway.

@joakime
Copy link
Contributor

joakime commented Jan 31, 2023

What specifically are you worrying about?

This seems like worrying about memory usage of idle threads. (which can be addressed in different ways)

@magibney
Copy link
Author

memory usage of idle threads (which can be addressed in different ways)

Exactly; for my particular case this is the essence of the problem, and without a doubt the best/right way to deal with the situation is to eliminate reliance on ThreadLocal caching.

Are there ways to address memory usage of idle threads, beyond "prevent included libraries from accessing ThreadLocal storage"? That's the right approach, but there are cases where it's hardly practical, and in the immediate term (in cases without influence over included libraries) may be effectively impossible.

My particular problem will in time be solved in another way. I'm raising this issue here mainly on principle, because:

  1. the concepts of individual thread idleTimeout and pool decay rate seem fundamentally different to me, so having them be independently configurable makes logical sense
  2. it would be trivial (I think?) to support this distinction in a way that is backward compatible, with no change in default behavior
  3. it will provide an escape valve (and a logically consistent/non-hacky one, at that) for users who find themselves in a situation where per-thread resource usage is beyond their control (or in contexts where pool size is more of a concern for some other reason)

@joakime
Copy link
Contributor

joakime commented Jan 31, 2023

The memory usage of a thread is determined by it's stack depth.
Jetty's QTP keeps their idle thread stack depth intentionally small because of this.
If you were to compare a Spring Boot thread memory usage to a Jetty QTP idle thread memory usage you would see a dramatic difference.

The JVM supports configurations to limit this and reduce the memory load of threads overall.

See:

  • The -XX:+PrintFlagsFinal output and the values for ThreadStackSize
  • The -Xss options.
  • The output of $ java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics -version

Note also that Java 8 is worse than Java 11 which is worse than Java 17 when it comes to memory use for threads.

@sbordet
Copy link
Contributor

sbordet commented Jan 31, 2023

The memory usage of a thread is determined by it's stack depth.

No. For platform threads, this is set by default at 1 MiB, independently from the stack depth.
So for 4 k threads this is 4 GiB of native memory "used" by the JVM.
This is one of the benefits that virtual threads offer wrt platform threads.

It's often unpractical to specify -Xss, so I think we need to offer a solution for this, probably along the lines of what the OP suggested.

@magibney
Copy link
Author

Thank you both for your feedback/consideration! Indeed, on-heap ThreadLocal concerns are in addition to any JVM-level or system level impacts from persistent high thread count. It would be fantastic to have this addressed going forward; I'm happy to offer assistance, to whatever extent that would be helpful.

@magibney
Copy link
Author

magibney commented Feb 7, 2023

Just a quick check-in to make sure this doesn't fall off the radar. If you're interested in moving forward with this enhancement, are there any open questions/concerns with the approach proposed? Any things to keep in mind if I were to try to update the proposed implementation to make it suitable for contribution?

@sbordet
Copy link
Contributor

sbordet commented Feb 7, 2023

I like the idea of having an additional configuration parameter that reduces the number of threads in case they are idle.

However, a "shrinkInterval" does not seem very intuitive to me... I think I prefer a "count" but not sure how would that be implemented.
I would like to ponder a bit more and try to find a better abstraction.

Keep pushing on this feature request, but understand that's not super-high priority.

@magibney
Copy link
Author

magibney commented Feb 7, 2023

Perfect, thanks!

One of the reasons I went with shrinkInterval was that it was extremely straightforward to implement (whereas something "count-based" looked like it'd be much trickier). Another benefit of the shrinkInterval approach (as opposed to something count-based) is that it decays more predictably (not jumpy).

Here's a thought: rather than force users to manually specify an absolute shrinkInterval shrinkBatchSize, perhaps it'd be more intuitive at the user API level to have the setting shrinkRate or decayRate, in the static range of 1-1000, where 1 => slow decay, 1000 => fast decay. This feels more intuitive; but perhaps more importantly, this doesn't lock into any particular implementation. No guarantees would need to be made regarding the exact contour of the decay, so if it's straightforward and practical to take a shrinkInterval approach as an implementation detail now, we could go ahead and do that now, while remaining free at any point in the future to switch to a shrinkBatchSize approach under the hood.

Running with this idea a bit: it'd be a natural fit to map 1 => "shrinkInterval == idleTimeout -- equivalent to the current behavior; and map 1000 => shrinkInterval == 0 -- any thread that has itself been idle for idleTimeout ms will expire immediately.

@hiteshk25
Copy link

How about having a boolean flag strictIdleTimeout? To add, we (or many others) are having this issue in production where a node consumes lots of thread local memory, and then we need to restart that node.

@magibney
Copy link
Author

@sbordet, after some consideration, I think your suggestion to have the API deal with "expireThreadCount (or perhaps idleTimeoutDecay?)" is indeed the way to go. It actually plays quite nicely with an under-the-hood implementation based on shrinkInterval!

I've pushed a new commit that illustrates this approach. The tricky thing implementation-wise is that the threads independently expire themselves, so any approach to expiring may neither be truly "batched" (expire N threads once per X interval) nor "linear" (expire N threads evenly spread across X interval).

The commit above does a very simple conversion of idleTimeout / idleTimeoutDecay to determine the minimum interval between pool shrinkage, then increments the pool's _lastShrink timestamp by that amount, starting from a baseline of the greater of the extant _lastShrink timestamp or now - idleTimeout. Thus the window keeps moving forward, but allowing for more frequent removals.

@gregw
Copy link
Contributor

gregw commented Feb 15, 2023

This is looking plausible.... can you make a draft PR so that we can use the tools to review/modify it?

sbordet added a commit that referenced this issue Mar 14, 2023
Introduced `QueuedThreadPool.idleTimeoutMaxShrinkCount` to be the number of idle threads that are exited in one idle timeout.

When set to 1 (the default), the old behavior is reproduced: expiring 1 thread every idle timeout.
When set to larger values, allows to keep around the threads for the idle timeout (in case of further load spikes), but allows to quickly recover OS memory when they are truly idle.

For example, with 2000 threads, 30 seconds idle timeout and idleTimeoutMaxShrinkCount=1, it will take 995 minutes (about 16.5 hrs) to shrink the pool back to 10 threads.
By setting idleTimeoutMaxShrinkCount=100, the thread pool can be shrunk to 10 threads in about 10 minutes.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
lorban added a commit that referenced this issue Mar 21, 2023
Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 21, 2023
Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 22, 2023
…eout period should be observed or not

Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 23, 2023
…ch block and that the interrupt flag is always cleared when it returns, also make aggressive shrinking the only possibility

Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 23, 2023
Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 24, 2023
Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 27, 2023
… minThread + comment why that's safe

Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 27, 2023
… minThread + comment why that's safe

Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 28, 2023
…e at least one idleTimeout period without shrinking

Signed-off-by: Ludovic Orban <lorban@bitronix.be>
lorban added a commit that referenced this issue Mar 28, 2023
…e at least one idleTimeout period without shrinking

Signed-off-by: Ludovic Orban <lorban@bitronix.be>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants