Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix creation and deletion of host port maps that would occasionally leave pods without them #37419

Merged
merged 2 commits into from
Feb 26, 2025

Conversation

javanthropus
Copy link
Contributor

Fixes: #32805

@javanthropus javanthropus requested review from a team as code owners February 3, 2025 16:08
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 3, 2025
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Feb 3, 2025
@javanthropus javanthropus force-pushed the pr/fix-host-port-cleanup branch 2 times, most recently from 2e5fb68 to 8c62005 Compare February 5, 2025 15:02
@joamaki
Copy link
Contributor

joamaki commented Feb 5, 2025

Hey, could you add descriptions to the commits that describe what the problem was and how exactly this fixes the issue? It is not clear to me what including the pod UID does here to improve things.

I need to port this fix to the new control-plane (pkg/loadbalancer/experimental) which replaces this code in v1.18, so need to figure out what the root cause is. Do you also have an example scenario of where the problem occurs? Is it in context of stateful sets or is it when the HostPort is handed over from one pod to another?

@javanthropus javanthropus force-pushed the pr/fix-host-port-cleanup branch from 8c62005 to f57ddf0 Compare February 5, 2025 22:07
@javanthropus javanthropus requested a review from joamaki February 5, 2025 22:08
@javanthropus
Copy link
Contributor Author

Hey, could you add descriptions to the commits that describe what the problem was and how exactly this fixes the issue? It is not clear to me what including the pod UID does here to improve things.

Done.

Do you also have an example scenario of where the problem occurs? Is it in context of stateful sets or is it when the HostPort is handed over from one pod to another?

I included a manifest and instructions for reproducing the issue in #32805. In short, this can happen any time the HostPort is handed from one pod to another. In my example, it affects pods of a Deployment, but we also saw it affect DaemonSet pods. I chose to use the pod UID in this PR because it could just as easily affect StatefulSet pods where the pod names don't necessarily change. Hopefully the descriptions in the commits help clarify things.

@joamaki
Copy link
Contributor

joamaki commented Feb 6, 2025

Hey, could you add descriptions to the commits that describe what the problem was and how exactly this fixes the issue? It is not clear to me what including the pod UID does here to improve things.

Done.

Do you also have an example scenario of where the problem occurs? Is it in context of stateful sets or is it when the HostPort is handed over from one pod to another?

I included a manifest and instructions for reproducing the issue in #32805. In short, this can happen any time the HostPort is handed from one pod to another. In my example, it affects pods of a Deployment, but we also saw it affect DaemonSet pods. I chose to use the pod UID in this PR because it could just as easily affect StatefulSet pods where the pod names don't necessarily change. Hopefully the descriptions in the commits help clarify things.

Thanks! This clarifies it. I don't think including the pod UID will do anything since pods are indexed by name everywhere, e.g. it is not possible to have two copies of a pod with the same name in the system.

I'll take a look at reproducing this issue and implement the fix&test for the upcoming control-plane. Will think about if there's a point in including the pod UID and will get back on that...

@javanthropus
Copy link
Contributor Author

I don't think including the pod UID will do anything since pods are indexed by name everywhere, e.g. it is not possible to have two copies of a pod with the same name in the system.

That's true, but the events associated with pod state changes are asynchronous and could arrive or be processed out of order. Via debug logs, I saw this happen and cause the removal of the host port map after a new pod was started to replace an old pod. I added the pod identity check in the first commit to guard against this case. When doing so, it seemed prudent to guard against the case of the old and new pods having the same name as well, such as for StatefulSet managed pods. I'm not aware of any better solution than the pod UID to do that.

Maybe what I should have done though is replace the pod name in the Name field of the ServiceName struct with the pod UID rather than add it. The pod name is redundant the vast majority of the time since the UID is guaranteed to be unique.

@joamaki
Copy link
Contributor

joamaki commented Feb 7, 2025

I don't think including the pod UID will do anything since pods are indexed by name everywhere, e.g. it is not possible to have two copies of a pod with the same name in the system.

That's true, but the events associated with pod state changes are asynchronous and could arrive or be processed out of order. Via debug logs, I saw this happen and cause the removal of the host port map after a new pod was started to replace an old pod. I added the pod identity check in the first commit to guard against this case. When doing so, it seemed prudent to guard against the case of the old and new pods having the same name as well, such as for StatefulSet managed pods. I'm not aware of any better solution than the pod UID to do that.

The name/namespace is how pods are indexed in Kubernetes. It is not possible to have two instances of the pod with the same name and different UID at the same time and since the events will arrive in order for a resource with a specific name, Cilium will not see things out of order, e.g. it will see the pod terminating, being deleted and then recreated. What can happen is that the deletion and recreation gets coalesced and cilium-agent will only see an "update" to the pod where the UID changes. And that is something that needs to be handled correctly (for example in the endpoint sub-system). However that is not going to affect HostPort handling. Adding the UID in the service name for HostPort doesn't really do anything except make the service name longer. However, I don't really have anything against this and I can see it being useful in debugging scenarios, so let's go for it.

The fix seems to be correct, however it's quite subtle and relies on the fact that a pod being updated to phase=Completed is considered a deletion.

Maybe what I should have done though is replace the pod name in the Name field of the ServiceName struct with the pod UID rather than add it. The pod name is redundant the vast majority of the time since the UID is guaranteed to be unique.

Let's keep the name as well since it's very informative.

@joamaki joamaki added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Feb 7, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 7, 2025
@joamaki joamaki added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Feb 7, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Feb 7, 2025

Verified

This commit was signed with the committer’s verified signature. The key has expired.
dtzWill Will Dietz
Because pod events are asynchronous, sometimes the event that would
trigger the removal of a host port map was seen to arrive after a new
pod with the same host port on the same node was started.  If the new
pod had the same IP address as the old one (not uncommon for DaemonSet
pdos), the old code performed the removal of the host port map that the
new pod was using.

The pod's UID is now included in the name of the load balancer service
associated with the host port map and checked against the pod associated
with the cleanup event.  The UID was chosen as the only identifier that
is unique regardless of how the pod is created since the pod's name
could be the same in some cases.

Signed-off-by: Jeremy Bopp <jeremy@bopp.net>

Verified

This commit was signed with the committer’s verified signature. The key has expired.
dtzWill Will Dietz
In some cases a pod update event would be received where the old pod
status did not include a host IP yet.  A check to compare old and new
pod details was incorrectly comparing host IPs in order to decide that
the update could be ignored.  The check also did not consider that the
pod IP addresses may have changed.  In both cases a host port map for
the pod would not be created.

Signed-off-by: Jeremy Bopp <jeremy@bopp.net>
@javanthropus javanthropus force-pushed the pr/fix-host-port-cleanup branch from f57ddf0 to 96856d6 Compare February 18, 2025 23:04
@javanthropus javanthropus requested a review from joamaki February 18, 2025 23:40
@javanthropus
Copy link
Contributor Author

@joamaki, I think I jammed this by requesting a re-review from you after pushing my last set of updates. Can you approve this again?

How much longer before this is merged?

@joamaki
Copy link
Contributor

joamaki commented Feb 25, 2025

/test

@javanthropus
Copy link
Contributor Author

I can't tell if the failing tests are a result of my changes, but I don't think they are.

@aanm aanm enabled auto-merge February 26, 2025 16:13
@aanm aanm added this pull request to the merge queue Feb 26, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 26, 2025
Merged via the queue into cilium:main with commit 5565d84 Feb 26, 2025
63 of 65 checks passed
@nbusseneau nbusseneau mentioned this pull request Feb 27, 2025
5 tasks
@nbusseneau nbusseneau added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Feb 27, 2025
@nbusseneau nbusseneau mentioned this pull request Feb 27, 2025
6 tasks
@nbusseneau nbusseneau added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Feb 27, 2025
@nbusseneau nbusseneau mentioned this pull request Feb 27, 2025
17 tasks
@nbusseneau nbusseneau added backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. and removed needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Feb 27, 2025
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. labels Feb 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cilium sporadically loses track of HostPort associated with daemonset pods
5 participants