
benchmark: Add sleepBetweenRPCs and connections parameters #6299

Merged
merged 20 commits into grpc:master on Jun 6, 2023

Conversation

@s-matyukevich (Contributor) commented May 18, 2023

I am trying to fix #5759 and have a branch with some promising results. However, the problem is that our current benchmarks don't really test the case I am trying to address, which is the following: we have some gRPC servers with a lot of mostly idle connections (in our case those connections mostly come from health checks). In this case grpc-go wastes a lot of memory on transport buffers that aren't shared between connections.

This PR modifies benchmarks so that we can simulate this case. It adds a random delay between subsequent RPCs and a configuration parameter which controls the maximum value of this delay. It also adds a parameter that allows configuring the number of connections.

RELEASE NOTES: none
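For illustration, here is a minimal sketch of the idea (not the exact code added by this PR): each benchmark goroutine sleeps a random duration, bounded by the configured maximum, before issuing the next RPC. The `runLoop` and `issueRPC` names are hypothetical.

```go
// Sketch only: illustrates the sleepBetweenRPCs idea; runLoop and issueRPC
// are hypothetical names, not the actual benchmark code.
package main

import (
	"math/rand"
	"time"
)

// runLoop issues RPCs until issueRPC reports an error (e.g. the benchmark is
// done), sleeping a random duration in [0, maxSleep) before each one so the
// connection stays mostly idle, similar to a connection that only carries
// periodic health checks.
func runLoop(maxSleep time.Duration, issueRPC func() error) {
	for {
		if maxSleep > 0 {
			time.Sleep(time.Duration(rand.Int63n(int64(maxSleep))))
		}
		if err := issueRPC(); err != nil {
			return
		}
	}
}
```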

@s-matyukevich changed the title from "Add sleepBetweenRPCs parameter for benchmarks" to "Add sleepBetweenRPCs parameter to benchmarks" on May 18, 2023
@dfawley (Member) left a comment
LGTM, but please see the question/comment below. Thanks!

Comment on lines 264 to 266
```go
if maxSleep > 0 {
	time.Sleep(time.Duration(math_rand.Intn(maxSleep)))
}
```
@dfawley (Member):
Is this client code or server code, or is this used for both sides?

I think we might want the sender in both directions to be sleeping, and not sleeping after receiving. If the sender isn't also sleeping, it could fill up buffers and have more impact than you intended.

@s-matyukevich (Contributor, Author):
Yes, you are right. I'll remove the delay from the client receive goroutine and instead add it here.

In order to do that I'll need to pass the sleepBetweenRPCs parameter to the server (probably I'll add it to the SimpleRequest RPC).

@s-matyukevich (Contributor, Author):
Fixed. I ended up using metadata to pass the sleep duration to the server. The SimpleRequest protobuf is used in too many places across languages, so I am not sure adding fields to it is a good idea.
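For illustration, a minimal sketch of this approach (the header key and helper names below are hypothetical, not necessarily what the PR uses): the client attaches the maximum sleep to the outgoing metadata, and the server reads it and sleeps before sending its response, so the sender is the side that pauses.

```go
// Sketch only: passing a sleep duration via gRPC metadata. The header key
// and function names are hypothetical.
package main

import (
	"context"
	"math/rand"
	"time"

	"google.golang.org/grpc/metadata"
)

const sleepHeader = "benchmark-sleep-between-rpcs"

// withSleepHeader attaches the configured maximum sleep to the outgoing context.
func withSleepHeader(ctx context.Context, maxSleep time.Duration) context.Context {
	return metadata.AppendToOutgoingContext(ctx, sleepHeader, maxSleep.String())
}

// sleepFromIncoming reads the header on the server and sleeps a random
// duration in [0, maxSleep) before the handler writes its response.
func sleepFromIncoming(ctx context.Context) {
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok {
		return
	}
	vals := md.Get(sleepHeader)
	if len(vals) == 0 {
		return
	}
	maxSleep, err := time.ParseDuration(vals[0])
	if err != nil || maxSleep <= 0 {
		return
	}
	time.Sleep(time.Duration(rand.Int63n(int64(maxSleep))))
}
```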

@s-matyukevich (Contributor, Author):
@dfawley I addressed your comment, but I also discovered two more issues:

  1. Right now the benchmarks create a single connection between the client and the server, which makes it impossible to reproduce the case I described in the PR description. I added one more parameter which allows using a separate connection per goroutine. Without this change my original PR is completely useless, but let me know if I should split these two changes into separate PRs.
  2. The benchmarks' memory reporting is actually not very useful. We compare total allocations before and after the benchmark run and divide them by the number of RPCs. This includes allocations that are actually garbage collected, so it doesn't give you an idea of how much memory your benchmark is actually using. IMO, in order to measure actual memory usage we need to sample MemStats.Alloc during program execution at some relatively high frequency and then take the average (which is something that pprof does); see the sketch below. For now I am just using pprof to analyze memory usage, but let me know if we should add a useful memory-usage indicator to the benchmark output.
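For illustration, a rough sketch of the sampling approach mentioned in point 2 (function and channel names are hypothetical): sample runtime.MemStats.Alloc on a timer and report the average when the benchmark stops. Note that runtime.ReadMemStats briefly stops the world, so a very high sampling frequency would itself perturb the benchmark.

```go
// Sketch only: average live-heap usage sampled during a benchmark run.
// Names are hypothetical; this is not part of the PR.
package main

import (
	"runtime"
	"time"
)

// sampleAlloc samples runtime.MemStats.Alloc every interval until stop is
// closed, then sends the average observed value on result.
func sampleAlloc(interval time.Duration, stop <-chan struct{}, result chan<- uint64) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	var sum, count uint64
	for {
		select {
		case <-ticker.C:
			var m runtime.MemStats
			runtime.ReadMemStats(&m)
			sum += m.Alloc
			count++
		case <-stop:
			if count > 0 {
				sum /= count
			}
			result <- sum
			return
		}
	}
}
```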

@s-matyukevich changed the title from "Add sleepBetweenRPCs parameter to benchmarks" to "Add sleepBetweenRPCs and connectionPerConcurrentRPC parameters to benchmarks" on May 19, 2023
@s-matyukevich changed the title from "Add sleepBetweenRPCs and connectionPerConcurrentRPC parameters to benchmarks" to "Add sleepBetweenRPCs and shareConnection parameters to benchmarks" on May 22, 2023
@arvindbr8 arvindbr8 requested a review from dfawley May 23, 2023 00:09
@arvindbr8 arvindbr8 assigned dfawley and unassigned s-matyukevich May 23, 2023
@dfawley (Member) commented May 23, 2023

> This includes allocations that are actually garbage collected

I believe this is by design. More allocations = slower and more frequent GC cycles = worse expected performance.

```diff
@@ -111,6 +111,7 @@ var (
 	serverReadBufferSize = flags.IntSlice("serverReadBufferSize", []int{-1}, "Configures the server read buffer size in bytes. If negative, use the default - may be a a comma-separated list")
 	serverWriteBufferSize = flags.IntSlice("serverWriteBufferSize", []int{-1}, "Configures the server write buffer size in bytes. If negative, use the default - may be a a comma-separated list")
 	sleepBetweenRPCs = flags.DurationSlice("sleepBetweenRPCs", []time.Duration{0}, "Configures the maximum amount of time the client should sleep between consecutive RPCs - may be a a comma-separated list")
+	shareConnection = flag.Bool("shareConnection", true, "Controls whether a single connection or connection per `maxConcurrentCalls` should be used.")
```
@dfawley (Member):
Instead of this, how about a flag for the number of connections, where each connection does maxConcurrentCalls at a time? Hopefully that wouldn't be too different implementation-wise, and would allow for some more interesting testing options.
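For illustration, a minimal sketch of that layout (names are hypothetical): with a connections flag, the benchmark would dial that many ClientConns and run maxConcurrentCalls workers on each, so the total number of in-flight RPCs is connections × maxConcurrentCalls.

```go
// Sketch only: N connections, each driving maxConcurrentCalls workers.
// runWorkers and work are hypothetical names, not the PR's actual code.
package main

import (
	"sync"

	"google.golang.org/grpc"
)

func runWorkers(conns []*grpc.ClientConn, maxConcurrentCalls int, work func(*grpc.ClientConn)) {
	var wg sync.WaitGroup
	for _, cc := range conns {
		for i := 0; i < maxConcurrentCalls; i++ {
			wg.Add(1)
			go func(cc *grpc.ClientConn) {
				defer wg.Done()
				work(cc) // issue RPCs on this connection until the benchmark ends
			}(cc)
		}
	}
	wg.Wait()
}
```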

@s-matyukevich (Contributor, Author):
Sure, done.

@github-actions (bot):
This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions github-actions bot added the stale label May 31, 2023
@s-matyukevich (Contributor, Author):
> This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

I think I answered all questions and implemented all suggestions; what other updates are expected?

@dfawley (Member) left a comment
Thank you!

@dfawley dfawley requested a review from zasweq June 6, 2023 16:26
@dfawley (Member) commented Jun 6, 2023

We need a second review before merging PRs from external contributors.

@zasweq (Contributor) left a comment
Some minor nits. LGTM otherwise though. Please fix before merging.

Five review threads (outdated, resolved) on benchmark/stats/stats.go, benchmark/benchmain/main.go, and benchmark/benchmark.go.
@zasweq (Contributor) left a comment

Some minor nits. LGTM otherwise though. Please get to those before merging :). Test failure is a flake, not related to this PR.

@zasweq (Contributor) commented Jun 6, 2023

Awesome, thanks for getting to my comments. Feel free to merge (which I think unblocks your other issue) :).

@s-matyukevich (Contributor, Author):
> Awesome, thanks for getting to my comments. Feel free to merge (which I think unblocks your other issue) :).

I don't have permissions to merge - please do it for me.

@zasweq changed the title from "Add sleepBetweenRPCs and connections parameters to benchmarks" to "benchmark: Add sleepBetweenRPCs and connections parameters" on Jun 6, 2023
@zasweq (Contributor) commented Jun 6, 2023

Done :).

@zasweq zasweq merged commit 1b66663 into grpc:master Jun 6, 2023
11 checks passed
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 4, 2023
Successfully merging this pull request may close these issues.

transport: investigate dynamic buffer sizes for the write buffer
4 participants