
add custom variant of jetty QueuedThreadPool that allows for configurable pool decay #79

Open · wants to merge 5 commits into base: fs/branch_9x

Conversation

@magibney (Collaborator) commented Mar 8, 2023

This PR is developed against Jetty 10, so it would take a bit of extra effort to backport to fs/branch_9_1 (at the time of writing still on Jetty 9). I don't plan to bother with this, because I expect the staging release branch will catch up to Jetty 10 with enough time before a production release.

The main benefit of making this change in the Solr repo is that we can run the whole test suite with the new config options/behavior (including integration tests in :solr:packaging:integrationTests). It's also nice that this puts the necessary classes in the right place, so no extra work is required for deployment, etc.

This is a temporary measure, so in order to be compatible with upstream formatting, we're disabling a bunch of format-type checks in CI for the :solr:jetty subproject.

See also:
jetty/jetty.project#9237
jetty/jetty.project#9354

  private String prefix;

  public InstrumentedQueuedThreadPool(@Name("registry") MetricRegistry registry) {
    this(registry, 200);

Collaborator

Would it be worth pulling some of these (maxThreads, minThreads, idleTimeout) out as constants for the default values?

Collaborator Author

Definitely, if this were our code; but I'm keeping it as close as possible to the upstream class it modifies, so I'd prefer to leave it as-is.

@@ -283,6 +283,7 @@ private void init(int port) {
    QueuedThreadPool qtp = new QueuedThreadPool();
    qtp.setMaxThreads(THREAD_POOL_MAX_THREADS);
    qtp.setIdleTimeout(THREAD_POOL_MAX_IDLE_TIME_MS);
    qtp.setIdleTimeoutDecay(THREAD_POOL_MAX_THREADS);

Collaborator

Do we want a different constant for idleTimeoutDecay or should this actually be tied to maxThreads?

Collaborator Author

For JettySolrRunner it probably doesn't matter. For clarity though, I agree -- I propose a different constant with the same value:

  private static final int THREAD_POOL_MAX_THREADS = 10000;
  private static final int THREAD_POOL_DECAY_RATE = THREAD_POOL_MAX_THREADS;

Collaborator

👍 on a new constant, same value. Provides better clarity.

// ========================================================================
//

package org.apache.solr.jetty;

Collaborator

Curious -- have you made any changes to this file for our purposes?

Collaborator Author

Yes. The commits are logically split to clarify what's pulled in from upstream versus the different kinds of changes -- functional (48c971f) and rote (a5a039b). This is complicated a bit while the upstream PR is in flight, because the substance of the changes is in the first of these two commits, which (maybe misleadingly) refers to "changes from upstream Jetty PR". To clarify, these changes are not yet merged upstream, but we're trying to track here the changes as proposed in the upstream PR.

* in a <code>finally</code> block) that any outstanding entries in this {@link ShrinkManager} are
* removed if the calling thread exits unexpectedly (i.e., throws an exception).
*/
public class ShrinkManager

Collaborator

QQ: do we now shrink one thread per idleTimeoutDecay interval, whereas earlier this used to happen once per idleTimeout interval?

Collaborator Author

Right. So there's now a distinction between the two values, with idleTimeout being interpreted as the minimum amount of time that excess pool capacity must be idle before it's eligible to shrink, and idleTimeoutDecay being the rate at which excess capacity shrinks once it's eligible to shrink according to idleTimeout.

Collaborator

Sure, let me work through an example:

  1. minThreads = 1
  2. idleTimeout = 60 seconds
  3. idleTimeoutDecay (I want one thread to decay every 5 seconds after a quick burst of requests -- it's not clear to me how I would configure that)

===
Let's say 10 threads are created. Would the threads then shrink over time at 65, 70, 75, 80, 85, 90, 95, 100, 105 seconds, leaving 1 thread remaining?
Is my understanding correct?

Collaborator Author

Exactly. The behavior described in your example would be achieved by setting idleTimeoutDecay=12 -- i.e., 12 threads will be allowed to decay for every idleTimeout (60s) interval, or one thread per 60s / 12 = 5s.
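
For concreteness, that would look roughly like this (a sketch using the setters from this PR's JettySolrRunner change, with the values from your example):

  QueuedThreadPool qtp = new QueuedThreadPool();
  qtp.setMinThreads(1);
  qtp.setIdleTimeout(60_000);   // excess capacity must be idle for 60s before it is eligible to shrink
  qtp.setIdleTimeoutDecay(12);  // once eligible, up to 12 threads may decay per 60s interval, i.e. one every 5s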

Collaborator

"by setting idleTimeoutDecay=12 -- i.e., 12 threads will be allowed to decay for every idleTimeout (60s) interval, or one thread per 60s / 12 = 5s"

This part is not clear to me. I was expecting a "5 seconds" decay setting to control shrinking after idleTimeout -- so if someone set it to 0 seconds, all threads would shrink immediately after idleTimeout.

Help me understand this better.

Collaborator Author

I think the confusion stems from the fact that idleTimeoutDecay is ultimately a number of threads, not an amount of time. So if you want all threads to shrink immediately after idleTimeout has passed, you'd set idleTimeoutDecay equal to maxThreads. You could set it higher, but it shouldn't make a difference practically speaking. And idleTimeoutDecay=1 would be equivalent to the current behavior, where 1 thread is allowed to expire per idleTimeout interval.
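
Put differently, here's a rough sketch of the arithmetic (illustrative only, not the actual implementation):

  long idleTimeoutMs = 60_000;
  int idleTimeoutDecay = 12;  // a thread count, not a duration
  long effectiveShrinkIntervalMs = idleTimeoutMs / idleTimeoutDecay;  // one thread may exit every 5000ms
  // idleTimeoutDecay == 1          -> current behavior: at most one thread decays per idleTimeout
  // idleTimeoutDecay == maxThreads -> all excess threads may decay as soon as idleTimeout has passed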

Collaborator

Yes. First, end users don't expect the behavior of the current idleTimeout implementation, and it now feels to me that this new parameter will complicate things further.

I think it would be simpler if we treated the new parameter as a timeout decay:

  1. The default timeout decay would be idleTimeout.
  2. It is easier to understand.
  3. More importantly, there is the minThreads parameter to keep a minimum number of threads alive.
  4. It may also simplify the implementation.

Collaborator Author

I'm not sure I follow -- what is the desired behavior you have in mind?

Collaborator @hiteshk25 commented Mar 20, 2023

I'm thinking of the following:

  1. Keep the current behavior as-is (decayTimeout = idleTimeout).
  2. If a thread has been idle for idleTimeout, shrink it (decayTimeout = 0). That's what most (all) users expect.
  3. In between, decay every 5 seconds as described in the example above.

Collaborator

I was looking at this commit. That seems like a somewhat simpler implementation.

while (NanoTime.elapsed(last, now) > siNanos)
{
    // shrinkInterval is satisfied
    if (lastShrink.compareAndSet(last, Math.max(last, idleBaseline) + siNanos))

Collaborator

What will happen with the following use case?

  1. The process is running with minThreads.
  2. A sudden burst of activity creates N more threads, so the total thread count is (minThreads + N).
  3. Let's assume the following happens:
    3.1 All timestamps have the same idleBaseline (or 3.1.1, all threads arrive here at the same time).
    3.2 All threads will get the same idleBaseline, and only one will succeed here (most importantly, only one thread will succeed since RETRY_LIMIT = 0?).
    3.3 Except for that one thread, all threads will continue in QTP.run() and wait up to idleTimeout for the next job?
    3.4 Then (3.1 - 3.3) happens again after idleTimeout.

Can this cause the same behavior as today, where each thread shrinks after one idleTimeout?

Collaborator Author

The situation you're describing is addressed by the comment immediately below:

                    // NOTE: attempted CaS may fail, _very_ infrequently. If it does, that's fine.
                    // This is a "best effort" approach to shrinking, and even if our CaS fails,
                    // the missed "shrink reservation" will very likely simply be picked up by
                    // another thread.

The worst-case scenario here is that an opportunity to shrink is missed. Even in tests where we're specifically trying to trigger this behavior, such as testPrecise(), it manifests infrequently. But we are able to trigger it sometimes, and this is the reason for the adjustable RETRY_LIMIT static field: in artificial situations where we care about shrink behavior with extreme precision (really only when writing tests, I suspect), we need to retry to avoid the race condition you're describing.

But "in real life":

  1. situations triggering this issue will be rare, and
  2. even if the issue is hit, the more threads, the more likely it is that the "shrink reservation" will be picked up by another thread, and
  3. even in the event we hit this issue in a way that fully manifests, the consequence is minor to the point of being benign
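
To make the "best effort" reservation pattern concrete, here is a simplified sketch (this is not the actual ShrinkManager implementation; tryReserveShrink is a hypothetical helper, and idleBaseline/retryLimit mirror the fields referenced in the excerpts above):

  // Simplified sketch of the best-effort "shrink reservation" (uses java.util.concurrent.atomic.AtomicLong).
  // Returns true if this thread won a reservation and may exit (shrink).
  static boolean tryReserveShrink(AtomicLong lastShrink, long idleBaseline, long shrinkIntervalNanos, int retryLimit) {
      int attempts = 0;
      long last = lastShrink.get();
      while (System.nanoTime() - last > shrinkIntervalNanos) {
          // claim one shrink slot; losing the CaS just means another idle thread claimed it
          if (lastShrink.compareAndSet(last, Math.max(last, idleBaseline) + shrinkIntervalNanos)) {
              return true;
          }
          if (attempts++ >= retryLimit) {
              return false; // best effort: the missed reservation will very likely be picked up by another thread
          }
          last = lastShrink.get();
      }
      return false;
  }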

{
    // No job immediately available maybe we should shrink?
    long idleTimeout = getIdleTimeout();
    if (idleTimeout > 0 && getThreads() > _minThreads)

Collaborator

The other use case is here:

  1. What if all threads check this condition at the same time and go into the pollIdleShrink() method?
  2. Is it then possible that pollIdleShrink() returns true for all of the threads, if getShrinkInterval() is very small and each thread is able to set lastShrink?

Collaborator Author

True, good catch! I dug into this a bit on the upstream PR (thanks for reporting there!), and even with a test specifically designed to trigger this case (thread count falling below minThreads), I was only ever able to get it to drop 1 thread below, and the ensureThreads() method in the finally block of the run loop brings the thread count back in line with the configured minThreads.

I'm inclined to go with the upstream, which appears to be leaning towards not worrying about this.
