clusterresolver: push empty config to child policy upon removal of cluster resource #6125

easwars · 2023-03-16T20:35:57Z

When the cluster resource associated with a CDS LB policy is removed, the CDS LB policy propagates this error to its child policy (clusterresolver LB), which stops the associated EDS watch. However, the clusterresolver LB policy was not sending a config update with no endpoints to its child (priority LB). As a result, there was no picker update from the leaf of the LB policy tree, so the subConns associated with backends in the cluster remained active and RPCs to the deleted cluster continued to succeed.

This PR fixes this by ensuring that the clusterresolver LB policy sends an empty config update to its child when the cluster resource is removed. This ensures that the child policies are cleaned up and subConns are removed, so RPCs to the removed cluster start to fail.
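To illustrate the behavior change, here is a minimal, self-contained sketch of a parent policy and a child policy. It is a toy model, not the grpc-go implementation: `childPolicy`, `parentPolicy`, `updateConfig`, `pick`, and `pushEmptyOnRemoval` are hypothetical names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// childPolicy is a hypothetical stand-in for the child (priority LB) policy.
// It keeps one "subConn" entry per endpoint address.
type childPolicy struct {
	subConns map[string]bool
}

func newChildPolicy() *childPolicy {
	return &childPolicy{subConns: map[string]bool{}}
}

// updateConfig reconciles the subConns with the endpoints in the new config.
// An empty config removes every subConn.
func (c *childPolicy) updateConfig(endpoints []string) {
	want := make(map[string]bool, len(endpoints))
	for _, e := range endpoints {
		want[e] = true
	}
	for addr := range c.subConns {
		if !want[addr] {
			delete(c.subConns, addr)
		}
	}
	for addr := range want {
		c.subConns[addr] = true
	}
}

// pick models the picker: it fails only when no subConns remain.
func (c *childPolicy) pick() error {
	if len(c.subConns) == 0 {
		return errors.New("no endpoints available")
	}
	return nil
}

// parentPolicy is a hypothetical stand-in for the clusterresolver LB policy.
type parentPolicy struct {
	child              *childPolicy
	pushEmptyOnRemoval bool // false models the old behavior, true the fix
}

// onResourceRemoved is called when the cluster resource is removed.
// Stopping the watch is not modeled here.
func (p *parentPolicy) onResourceRemoved() {
	if p.pushEmptyOnRemoval {
		p.child.updateConfig(nil)
	}
}

func main() {
	for _, fixed := range []bool{false, true} {
		child := newChildPolicy()
		child.updateConfig([]string{"10.0.0.1:8080", "10.0.0.2:8080"})
		p := &parentPolicy{child: child, pushEmptyOnRemoval: fixed}
		p.onResourceRemoved()
		fmt.Printf("pushEmptyOnRemoval=%v: pick error = %v\n", fixed, child.pick())
	}
}
```

In this model, without the empty config the child never learns that its endpoints are gone, so picks keep succeeding; with it, the subConns are dropped and picks fail.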

Also fixes #6083, because it replaces the flaky test with an e2e style test.

RELEASE NOTES:

clusterresolver: push empty config to child policy upon removal of cluster resource


arvindbr8 · 2023-03-17T20:39:48Z

xds/internal/balancer/clusterresolver/e2e_test/balancer_test.go

+	start := time.Now()
+	end := start.Add(time.Second)


NIT: maybe merge these lines to
end := time.Now().Add(time.Second)

arvindbr8 · 2023-03-17T20:48:54Z

xds/internal/balancer/clusterresolver/e2e_test/balancer_test.go

+	}
+
+	// Ensure that the EDS watch is not canceled.
+	sCtx, sCancel := context.WithTimeout(ctx, defaultTestShortTimeout)


Is defaultTestShortTimeout (10ms) a long enough wait here? How about we move this into the RPC for loop check below and error if we receive anything on the edsResourceCanceledCh channel in that 1s?

That way we may be able to remove the sCtx created here.

dfawley · 2023-03-17T23:18:32Z

xds/internal/balancer/clusterresolver/resource_resolver.go

+	select {
+	case <-rr.updateChannel:
+	default:
+	}


This makes me wonder whether it's possible for this to race with another source? If not, maybe a comment about this?

Thanks @dfawley. Looks like there is a small race possible in the eds resource resolver. I will fix that and ping the PR again.

Done. PTAL. Thanks.
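For readers unfamiliar with the pattern under discussion, here is a minimal sketch of "drain, then send" on a channel with a buffer of one. It assumes all senders are serialized by a mutex, which is one way to keep a second sender from refilling the buffer between the drain and the send; this is an illustration only and not necessarily how the PR resolved the race. The `notifier` type and its `push` method are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// notifier holds a channel with a buffer of one, so that only the most
// recent update is kept for the reader.
type notifier struct {
	mu       sync.Mutex
	updateCh chan string
}

func newNotifier() *notifier {
	return &notifier{updateCh: make(chan string, 1)}
}

// push replaces any pending update with u. The mutex serializes senders, so
// after the drain the buffer is guaranteed to be empty and the send cannot
// block. Without it, another sender could fill the buffer in between.
func (n *notifier) push(u string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	select {
	case <-n.updateCh: // drop the stale update, if any
	default:
	}
	n.updateCh <- u
}

func main() {
	n := newNotifier()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			n.push(fmt.Sprintf("update-%d", i))
		}(i)
	}
	wg.Wait()

	// The final push overwrites whichever update was left pending.
	n.push("stopped")
	fmt.Println(<-n.updateCh) // prints "stopped"
}
```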

arvindbr8 · 2023-03-21T18:13:13Z

LGTM

dfawley

Cool, glad my comment caught a potential problem!

easwars requested review from dfawley and arvindbr8 March 16, 2023 20:36

easwars added the Type: Bug label Mar 16, 2023

easwars added this to the 1.54 Release milestone Mar 16, 2023

arvindbr8 requested changes Mar 16, 2023


clusterresolver: push empty config to child policy upon removal of cluster resource

e869163

easwars force-pushed the remove_cluster_fix branch from b286c3c to e869163 Compare March 16, 2023 22:03

easwars modified the milestones: 1.54 Release, 1.55 Release Mar 17, 2023

drain update channel before sending on it from stop()

469ae05

arvindbr8 requested changes Mar 17, 2023


easwars added 2 commits March 17, 2023 13:56

review comments: test improvements

565cfd8

make vet happy: remove unused code

4a86564

easwars assigned dfawley and arvindbr8 Mar 17, 2023

arvindbr8 approved these changes Mar 17, 2023


arvindbr8 removed their assignment Mar 17, 2023

dfawley approved these changes Mar 17, 2023


dfawley assigned easwars and unassigned dfawley Mar 17, 2023

handle a possible race around stop

9bd8cde

easwars assigned dfawley and arvindbr8 and unassigned easwars Mar 21, 2023

arvindbr8 removed their assignment Mar 21, 2023

dfawley approved these changes Mar 21, 2023


dfawley assigned easwars and unassigned dfawley Mar 21, 2023

easwars merged commit cdab8ae into grpc:master Mar 21, 2023

easwars deleted the remove_cluster_fix branch March 21, 2023 22:37

github-actions bot locked as resolved and limited conversation to collaborators Sep 18, 2023
