Optionally crash benchmark QPS client worker if it fails to connect to the server target. #36511

Open
wants to merge 1 commit into
base: master

@AshZhang AshZhang commented May 2, 2024

By default, the QPS worker will crash if it cannot connect to the server target while operating as a client.

The die_on_connection_failure flag defaults to true. When it is explicitly set to false, the QPS worker process keeps running if it fails to connect to the server target. Instead of dying, the worker finishes the client threads, cancels the context (which ends the benchmark run and terminates the QPS JSON driver), and returns to its idle state, waiting for the next RunClient or RunServer invocation.

This is useful in benchmark setups where runs are invoked continuously on long-lived QPS worker processes, reducing spurious worker deaths and the need for manual restarts due to transient connectivity failures.
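To make the two code paths concrete, here is a minimal sketch of the decision described above. It is not the code in this PR: FakeClient, FakeContext, ConnectOutcome, WorkerState, and HandleConnectResult are hypothetical stand-ins, and the sketch throws an exception where the real worker calls grpc_core::Crash so that both branches can be exercised without killing the process.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; none of these are gRPC types.
enum class WorkerState { kIdle, kRunningClient };

struct FakeClient {
  bool connected = false;       // outcome of the connection attempt
  bool threads_joined = false;  // set once the client threads have finished
  void AwaitThreadsCompletion() { threads_joined = true; }
};

struct FakeContext {
  bool cancelled = false;
  void TryCancel() { cancelled = true; }
};

struct ConnectOutcome {
  WorkerState state;
  std::string error;  // empty on success; the text is a placeholder
};

// Mirrors the flag semantics from the description: die by default, otherwise
// clean up and fall back to the idle state.
ConnectOutcome HandleConnectResult(bool die_on_connection_failure,
                                   FakeClient& client, FakeContext& ctx) {
  if (client.connected) {
    return {WorkerState::kRunningClient, ""};
  }
  if (die_on_connection_failure) {
    // Assumption: stands in for grpc_core::Crash(...), which aborts.
    throw std::runtime_error("Client failed to connect to all channels");
  }
  client.AwaitThreadsCompletion();  // finish the client threads
  ctx.TryCancel();                  // end the run so the driver can exit
  return {WorkerState::kIdle, "client failed to connect"};
}
```

With the flag set to false, a failed connection leaves the worker in kIdle with the client threads joined and the context cancelled; with the flag at its default, the same failure terminates the worker.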

CC: @apolcyn

test/cpp/qps/qps_worker.cc (outdated)
grpc_core::Crash("Client failed to connect to all channels");
} else {
client->AwaitThreadsCompletion();
ctx->TryCancel();
Contributor

I don't think TryCancel is necessary?

Simply terminating the RPC with the status code and message below seems ideal.

Author

TryCancel ensures that the stream passed in from the QPS JSON driver is closed, so the driver terminates if the client fails to connect. Without the TryCancel call, the driver hangs forever waiting for the client to write the initial status.
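A toy model of the driver's view of the stream may help illustrate the hang. This is only a sketch under stated assumptions: FakeWorkerStream, ReadResult, and PollRead are hypothetical names, not gRPC APIs, and a real driver performs a blocking read rather than polling. kStillBlocked marks the state in which that blocking read would never return.

```cpp
#include <string>
#include <utility>

// What a driver-side read would observe on the worker stream.
enum class ReadResult { kStillBlocked, kGotStatus, kStreamClosed };

// Hypothetical stand-in for the stream between the QPS JSON driver and the
// worker; it only tracks whether a status was written or the stream closed.
class FakeWorkerStream {
 public:
  // Worker side: the initial status a successfully connected client writes.
  void WriteInitialStatus(std::string status) {
    status_ = std::move(status);
    has_status_ = true;
  }

  // Worker side: models the effect of TryCancel, which closes the stream.
  void TryCancel() { closed_ = true; }

  // Driver side: reports what a blocking read would do at this moment.
  ReadResult PollRead() const {
    if (has_status_) return ReadResult::kGotStatus;
    if (closed_) return ReadResult::kStreamClosed;
    return ReadResult::kStillBlocked;  // a real read would keep waiting
  }

 private:
  std::string status_;
  bool has_status_ = false;
  bool closed_ = false;
};
```

If the client never connects, it never writes an initial status, so the model stays in kStillBlocked until TryCancel moves it to kStreamClosed and gives the driver something to return on.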
