Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds: make comparison of server configs in bootstrap more reliable #6112

Merged
merged 2 commits into from Mar 15, 2023

Conversation

easwars
Copy link
Contributor

@easwars easwars commented Mar 10, 2023

When the clusterimpl LB policy receives a config update, it checks if the load_reporting_server configuration has changed. If so, it closes the existing LRS stream and creates a new one.

Unfortunately this equality comparison was broken because the bootstrap.ServerConfig struct contains a grpc.DialOption to store the configured credentials. The implemention of this dial option (like many others) uses a function pointer which is not very amenable to comparisons. So, even if two configuration specify the same transport credentials, the equality comparison will fail.

This PR addresses this issue and ensures that the LRS stream is not closed and re-created when the actual load_reporting_server config has not changed.

Summary of changes:

  • Add a ChannelCreds type in the bootstrap package to represent channel_credentials specified as part of the server config.
    • This stores the credentials' type and configuration and makes it possible to compare credentials based on their config instead of based on the dial option
  • Update the ServerConfig type in the bootstrap package in the following ways:
    • Store channel credentials using the newly added ChannelCreds type.
    • Also, store the built credentials as a unexported field and make it available via a method.
    • Add a convenience function to create a server config from JSON, ServerConfigFromJSON.
    • Store the server_features as part of the server config. This also needs to be used for equality comparisons.

This fixes an internal P1 issue that is blocking the federation release.

Fixes #6097

RELEASE NOTES:

  • xds/clusterimpl: Fix a bug causing unnecessary closing and re-opening of LRS streams

@easwars easwars requested a review from dfawley March 10, 2023 18:28
@easwars easwars added this to the 1.54 Release milestone Mar 10, 2023
@easwars
Copy link
Contributor Author

easwars commented Mar 10, 2023

@arvindbr8 : FYI, this change could cause merge conflicts with your A53 PR. Also, I don't remember where you are storing the server_features in that PR. But it would be a better idea to store it as part of the ServerConfig as introduced in this PR.

// Also ensure that a new LRS stream is not created.
sCtx, sCancel = context.WithTimeout(ctx, defaultTestShortTimeout)
defer sCancel()
if _, err := mgmtServer.LRSServer.LRSStreamOpenChan.Receive(ctx); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be sCtx, right? Otherwise this waits 10 seconds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, thanks.

Comment on lines 159 to 162
features := ""
if len(sc.ServerFeatures) != 0 {
features = strings.Join(sc.ServerFeatures, "-")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since strings.Join returns "" when given an empty slice, this can be simplified to:

features := strings.Join(sc.ServerFeatures, "-")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.

Comment on lines +221 to +222
case (sc != nil) != (other != nil):
return false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: if sc == nil || other == nil is equivalent given the preceding case; I'm not sure which is more readable, but I find this form with 3 !=s to be a little confusing, personally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that sc == nil || other == nil is much easier to read than (sc != nil) != (other != nil) when considered in isolation. But when used here, the former requires carrying over of state from the previous case, because we need to remember that both are not nil to make complete sense of this statement. But the latter is more self-contained as it simply compares the nilness of one with the other.

I think I might have written it as if sc == nil || other == nil if I had written this whole method as a series of if conditions, instead of a case statement.

@dfawley dfawley assigned easwars and unassigned dfawley Mar 14, 2023
Copy link
Contributor Author

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

// Also ensure that a new LRS stream is not created.
sCtx, sCancel = context.WithTimeout(ctx, defaultTestShortTimeout)
defer sCancel()
if _, err := mgmtServer.LRSServer.LRSStreamOpenChan.Receive(ctx); err == nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, thanks.

Comment on lines 159 to 162
features := ""
if len(sc.ServerFeatures) != 0 {
features = strings.Join(sc.ServerFeatures, "-")
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.

Comment on lines +221 to +222
case (sc != nil) != (other != nil):
return false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that sc == nil || other == nil is much easier to read than (sc != nil) != (other != nil) when considered in isolation. But when used here, the former requires carrying over of state from the previous case, because we need to remember that both are not nil to make complete sense of this statement. But the latter is more self-contained as it simply compares the nilness of one with the other.

I think I might have written it as if sc == nil || other == nil if I had written this whole method as a series of if conditions, instead of a case statement.

@easwars easwars merged commit 52ca957 into grpc:master Mar 15, 2023
10 checks passed
@easwars easwars deleted the drops_with_federation branch March 15, 2023 01:37
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

xds: clusterimpl LB policy restarts load reports with no change in lrs_server config
2 participants