Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection pool's TaskRunner wakes up frequently and drain CPU #6838

Closed
amitbalbule opened this issue Aug 25, 2021 · 11 comments
Closed

Connection pool's TaskRunner wakes up frequently and drain CPU #6838

amitbalbule opened this issue Aug 25, 2021 · 11 comments
Milestone

Comments

@amitbalbule
Copy link

amitbalbule commented Aug 25, 2021

Hello.

Configuring a connection pool with a small enough value of keep-alive duration results in high CPU consumption and battery drainage (on Android).
Issue observed in version 4.9.1 (did not test other versions).

Specifically, the use case is creating a client with a connection pool such as:
ConnectionPool(0, 1, TimeUnit.NANOSECONDS)

Although this configuration essentially means no connection pool (size 0 and short-lived idle connections), it should not be consuming CPU indefinitely.
Looking at the app's profiling - TaskRunner threads seem to be the cause for the high CPU usage.
Using 1ms mitigates it (but not solve it), and so on.

Talking with @swankjesse it was pointed out that the implementation might use the keep-alive value as the max time between runs.
(I did not post a code snippet or a repro code since this issue seems to be clear).

As a work-around, using 1 minute for keep-alive solves it and no significant CPU consumption is observed.

Thank you.

@amitbalbule amitbalbule added the bug Bug in existing code label Aug 25, 2021
@yschimke yschimke added enhancement Feature not a bug and removed bug Bug in existing code labels Aug 26, 2021
@yschimke
Copy link
Collaborator

There is a lot we could do here if we decided to make OkHttp connected to network and foreground/background events in Android. But it's always been out of scope for us. Without making this Android specific there is a tension between the best options for Server to Server traffic and Mobile to Server.

@swankjesse
Copy link
Member

There might be a bug where we wake up every timeout even if there’s no work to do. We don’t generally encounter this because timeout is usually infrequent.

@amitbalbule
Copy link
Author

Is this only in case there are active connections, or regardless?

@swankjesse swankjesse added this to the 5.0 milestone Nov 20, 2021
@yschimke
Copy link
Collaborator

yschimke commented Nov 28, 2021

I was able to reproduce this with the ConnectionPool settings you've defined, only if I deliberately avoid completing a request (reading body or closing it). In this case the churn on the TaskRunner is much higher with a keepAliveDuration of 1ns, vs 1 second or 5 minutes.

I would suggest to use a value higher than 1ns to avoid this churn.

The TaskRunner keeps waking up (roughly tied to keepAliveDuration) if

  • The request is incomplete

Pings keep sending if

  • Default connection pool is used and no call to connectionPool.evictAll is made

But with the connection pool set as described it seems to work correctly, no obvious activity from the TaskRunner when responses are correctly closed.

yschimke@0c838c9

@yschimke yschimke added the bug Bug in existing code label Nov 28, 2021
@yschimke
Copy link
Collaborator

RealConnectionPool cleanupTask deliberately returns this keepAliveDurationNs that you have set, this is essentially working as designed.

      inUseConnectionCount > 0 -> {
        // All connections are in use. It'll be at least the keep alive duration 'til we run
        // again.
        return keepAliveDurationNs
      }

@yschimke yschimke removed bug Bug in existing code enhancement Feature not a bug labels Nov 28, 2021
@swankjesse
Copy link
Member

@yschimke right now the task runner keeps waking up if there's only active connections so it can detect if they've leaked. This doesn't need to be as frequent as keepAliveDurationNs, especially if that duration is really short. Maybe we change this to return the max of the keepAliveDuration and one minute? That'll keep the pool mostly idle when all it's doing is checking for leaks.

@yschimke
Copy link
Collaborator

yschimke commented Dec 4, 2021

For discussion #6951

@yschimke
Copy link
Collaborator

yschimke commented Dec 4, 2021

@swankjesse This seems high value to fix, OkHttp should let you configure something that hurts performance.

@amitbalbule
Copy link
Author

Thank you for this @yschimke

Is it true that a leaked connection (due to an application bug) will never be evicted (unless application holds no references to it anymore), and so if only "inUseConnections" are in the pool, the TaskRunner will be woken up indefinitely ?
(every keepAliveNs, or as @swankjesse suggested, max(keepalive, 1min) ?)
This has nothing to do with the pool size - is this correct? (since this is an "inUseConnection" rather than idle).

If so, and if I understand correctly - @swankjesse you say that an app bug should only affect the app's resources, rather than the okHttp lib performance - even if it is being misused?

@swankjesse
Copy link
Member

It currently wastes resources when the keepAliveDurationNs is very short.

@yschimke
Copy link
Collaborator

I was able to reproduce this with the ConnectionPool settings you've defined, only if I deliberately avoid completing a request (reading body or closing it).

This should* be fixed with #7801, and also is only impacting if keepAliveDurationNs is small.

  • It's possible other things are going on here, so let's reopen if reoccuring after 5.0.0-alpha.12

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants