
Revert "Fix for the SDS update failure (#615)" as no longer needed on top of #559 #657

Merged: 2 commits, Mar 20, 2023

Conversation

valerian-roche (Contributor)

As discussed in this issue, the fix done in #615 is creating problems:

  • it is no longer applicable on top of delta: [#558] Support dynamic wildcard subscription in delta-xds #559, which already ensures that subscription changes are properly reflected without modifying the VersionMap
  • it is invalid when new resources are subscribed to or removed, leading to further issues in this context (e.g. a resource deletion will never be notified if the resource was not originally watched)

#615 did not include any test, so it is hard to know whether we are properly covering the use case, but from the discussions we should already be. If that is not the case (e.g. in SotW), a fix similar to #559 will likely be required.

@@ -119,26 +119,8 @@ func (s *server) processDelta(str stream.DeltaStream, reqCh <-chan *discovery.De

watch := watches.deltaWatches[typ]
watch.nonce = nonce
// As per XDS protocol, for the non wildcard resources, management server should only respond to the resources
Contributor

[Query]
I understand that this fix is not required as we are maintaining a separate list of subscribed resources. However, after removing the change below, the race condition is still applicable and will result in removing from the resource versionMap an entry that was already notified to the client.

Now, during the next setSnapshot, when we create the delta response, the implementation below still relies on GetResourceVersions(), which is invalid due to the race condition (a resource that was successfully notified to the client was overwritten). Since GetResourceVersions() is incorrect, we will unnecessarily send the resource to the client again even though it was already notified and there was no change in the hash value.

I agree that notifying the resource again is not harmful, but it might not be an efficient approach with a large configuration. IMO we should address the race condition so that resources which were already notified remain intact in the resourceVersionMap.

@valerian-roche @haoruolei Please correct me if I am wrong in my findings. Thanks.

nextVersionMap = make(map[string]string, len(state.GetSubscribedResourceNames()))
// state.GetResourceVersions() may include resources no longer subscribed
// In the current code this gets silently cleaned when updating the version map
for name := range state.GetSubscribedResourceNames() {
	prevVersion, found := state.GetResourceVersions()[name]
	if r, ok := resources.resourceMap[name]; ok {
		nextVersion := resources.versionMap[name]
		if prevVersion != nextVersion {
			filtered = append(filtered, r)
		}
		nextVersionMap[name] = nextVersion
	} else if found {
		toRemove = append(toRemove, name)
	}
}
}

valerian-roche (Contributor Author)

Hi

Can you provide a test case for the aforementioned race condition?
It is also not clear to me where you expect this race to occur: is it that a new request is handled prior to the first response, and its response might be created before the previous one is returned?

Contributor

@valerian-roche
Yes, the race condition arises because a new request from the client is handled prior to the first response; as a result the resourceVersionMap did not have the correct subscription list (since the resourceVersionMap was being used to maintain the subscription list).

As you are maintaining a separate map for the subscription list, the above race condition will not overwrite or corrupt the subscription list, but the race condition still exists and will corrupt/overwrite the hash of a resource that was already notified to the client.

valerian-roche (Contributor Author)

Can you provide a test showing the issue?
From my understanding, you have:

  • a first request subscribes to resource A. At this stage the subscribed list is set to [A], and the version map is empty
  • a response for this request is being built and has been queued for replying, but at this stage a second request is processed, adding a subscription for B. Now the subscribed list is [A, B] and the version map is still empty
  • the response for the first request is sent. You get an update for A. The version map is now [A]
  • the response for the second request is sent. You get an update for A and B, and the version map is now [A, B]. The response for the second request was processed prior to the version map being updated for the first request, so A is notified twice when it was not needed

I am not fully clear on what you mean by "corrupted" here, as the version map should be good in the end. The xDS protocol clearly states that the control-plane can send updates for resources which have not changed or have not been explicitly subscribed to; from https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#how-the-client-specifies-what-resources-to-return:

Normally (see below for exceptions), requests must specify the set of resource names that the client is interested in. The management server must supply the requested resources if they exist. The client will silently ignore any supplied resources that were not explicitly requested. When the client sends a new request that changes the set of resources being requested, the server must resend any newly requested resources, even if it previously sent those resources without having been asked for them and the resources have not changed since that time. If the list of resource names becomes empty, that means that the client is no longer interested in any resources of the specified type.

The opposite case, where we would not send an update for a resource that was removed and then re-added, is a big part of why the logic is the way it is, as it eventually converges to the latest version being sent for all requested resources. The change in #615 potentially creates this case, as the version map no longer reflects what the last returned objects were (an object removed and then re-added while still at the same version will not be sent). This is highly problematic for delta EDS and CDS updates, which can create this specific setup.
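
To make the sequence above concrete, here is a minimal, self-contained Go sketch (hypothetical names, not go-control-plane code) mimicking the version-map filtering shown in the diff: both responses are built against the same empty version map, so A is sent twice, but the map still converges to the latest versions once the last response is applied.

package main

import "fmt"

// buildResponse mimics the filtering loop from the diff: resources whose cached
// version differs from the previously sent version are (re)sent, and the next
// version map is rebuilt from the subscription list.
func buildResponse(subscribed []string, prevVersions, cacheVersions map[string]string) ([]string, map[string]string) {
	var sent []string
	next := make(map[string]string, len(subscribed))
	for _, name := range subscribed {
		if v, ok := cacheVersions[name]; ok {
			if prevVersions[name] != v {
				sent = append(sent, name)
			}
			next[name] = v
		}
	}
	return sent, next
}

func main() {
	cache := map[string]string{"A": "v1", "B": "v1"}
	versionMap := map[string]string{} // empty: no response has been applied yet

	// Both requests are processed before either response updates the version map.
	sent1, next1 := buildResponse([]string{"A"}, versionMap, cache)
	sent2, next2 := buildResponse([]string{"A", "B"}, versionMap, cache)

	fmt.Println(sent1, next1) // [A] map[A:v1]
	fmt.Println(sent2, next2) // [A B] map[A:v1 B:v1] -> A is resent once
	versionMap = next2        // applying the last response leaves the map correct
	fmt.Println(versionMap)   // map[A:v1 B:v1]
}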


I understand that the Envoy protocol is eventually consistent. The data will not be corrupted with a separate subscription list and resource version map. The logic above always compares the resource in the cache with the resource map. So even if responses are not in the order of requests, in the end the newest version will be recorded in the version map and sent to the client?

valerian-roche (Contributor Author)

Yes, the current code ensures that the latest response will be run against the latest subscription set and with the version map as last set by a response. In a case where envoy would, in very quick succession, subscribe to something, then remove it, then re-subscribe, there is a chance that we would not send the resources again if all three requests are processed before a response is first sent. This can happen because the queueing at https://github.com/envoyproxy/go-control-plane/blob/main/pkg/server/delta/v3/server.go#L198 does not guarantee how many responses can be pending before the select runs, though realistically I would not expect this to occur with more than one pending response in the general case (as the queueing time would have to be longer than the request processing time).
A known remaining issue is when a request reports a NACK. I have another PR to properly identify what has been returned and what has been acknowledged (which can be very different in some cases), but it is pending on the rework of the Cache interface.
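
As a rough illustration, here is a hypothetical, self-contained sketch (not the actual server.go code) of that ordering window: responses are handed off through a channel, and the select loop does not prioritize pending responses over new requests, so several requests can be processed before a response is sent and the version map updated.

package main

import "fmt"

func main() {
	reqCh := make(chan string, 2)  // incoming delta requests
	respCh := make(chan string, 2) // responses queued by worker goroutines

	reqCh <- "request 1: subscribe A"
	reqCh <- "request 2: unsubscribe / resubscribe A" // arrives before response 1 is drained
	respCh <- "response 1 for A"                      // built and queued, not yet sent

	// The select does not prioritize pending responses over new requests, so
	// several requests can be processed before any response is sent (and
	// before sending it updates the version map).
	for i := 0; i < 3; i++ {
		select {
		case req := <-reqCh:
			fmt.Println("processing", req)
		case resp := <-respCh:
			fmt.Println("sending", resp)
		}
	}
}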


Can you elaborate on the sub, remove, re-sub case above? I'm not sure I follow what could go wrong.

valerian-roche (Contributor Author)

The issue is that the VersionMap used to build a response is set when the last response was sent.

  • A first request subscribing to A and B comes in and gets a response. Subscriptions are [A, B] and the version map is [A, B]
  • A second request unsubscribing from B comes in. Subscriptions are [A], and this builds a response for A. The response for A ends up in the dequeue/requeue goroutine
  • Another request comes in, re-subscribing to B. Subscriptions are [A, B], and the response is built with the version map still set to [A, B]. There is no change, so the watch gets created
  • The response for the second request is sent, and now the version map is [A] and the subscriptions are [A, B], with the watch waiting for any cache change. B is still at the same version as when initially sent (otherwise it would be resent and we'd be fine), but B was not sent again, which violates the xDS protocol. That might be addressable by forcing the version to "" as done in https://github.com/envoyproxy/go-control-plane/blob/main/pkg/server/delta/v3/server.go#L265 outside of the wildcard case, but that will likely require more work

To be clear, this will only happen if envoy sends both requests in a very short timeframe (~ms), compared to the current state after #615 where it will occur every time, which fully breaks delta EDS.
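
A minimal sketch (hypothetical values, not go-control-plane code) of why B is dropped in this sequence: the re-subscribe response is filtered against the stale version map that still contains B at its current version.

package main

import "fmt"

func main() {
	cache := map[string]string{"A": "v1", "B": "v1"}

	// Version map as set after the first response, when A and B were sent.
	staleVersionMap := map[string]string{"A": "v1", "B": "v1"}

	// The re-subscribe response for [A, B] is built before the unsubscribe
	// response (which would shrink the version map to [A]) has been applied.
	subscribed := []string{"A", "B"}
	var resend []string
	for _, name := range subscribed {
		if staleVersionMap[name] != cache[name] {
			resend = append(resend, name)
		}
	}
	fmt.Println(resend) // [] -> B is never sent again, violating the xDS protocol
}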

Overall, I'm not really willing to change too much prior to #584 and #586, which are explicitly designed to fix this case and require far more extensive logic updates.


Thanks for the explanation. This seems like a rare scenario, but I am looking forward to the future fix work! BTW, do you mean setting the version to "" when subscribing to a new resource?

Contributor


@valerian-roche You are correct: with your new changes and the separate subscription list, the version map at the end will be correct. It is just that a client sending requests in a very short span would result in resending the resource, and as you said, it should be OK to resend. In fact, resending the same object again was the behaviour even with #615. I think we are good to remove the changes done as part of #615.

Commits:

…mping in binaries
Signed-off-by: Valerian Roche <valerian.roche@datadoghq.com>

This reverts commit 0e0f25d.
Signed-off-by: Valerian Roche <valerian.roche@datadoghq.com>
@haoruolei

BTW do you happen to know when the next go-control-plane release is planned for? I assume this revert will be included.

@valerian-roche (Contributor Author)

@alecholmez is cutting releases, so I would defer to him on this one

@alecholmez (Contributor) left a comment


I'm good with merging this. Just caught up on the PR talk.

Also, we don't have a release schedule per se, but I hope to accomplish the multi-module release soon.

@haoruolei


Thanks for the info. Hopefully this can be released soon
