Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Handle unstructured status update with fake client #2495

Merged

Conversation

scothis
Copy link
Member

@scothis scothis commented Sep 14, 2023

In order to prevent unintentional mutations outside of the status during a status update, the non-status fields are copied back onto the passed object.

This operation now gracefully handles both unstructured and typed objects. Previously, it would panic if an unstructured object was passed for a GVK known to the scheme, as internally the object within the tracker is converted to the typed equivalent. The two types cannot be directly assigned to each other and instead must be copied.

Adds a failing test to demonstrate the bug, and the fix to make the test pass.

Resolves #2493

In order to prevent unintentional mutations outside of the status during
a status update, the non-status fields are copied back onto the passed
object.

This operation now gracefully handles both unstructured and typed
objects. Previously, it would panic if an unstructured object was passed
for a GVK known to the scheme, as internally the object within the
tracker is converted to the typed equivalent. The two types cannot
be directly assigned to each other and instead must be copied.

Signed-off-by: Scott Andrews <andrewssc@vmware.com>
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 14, 2023
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 14, 2023
@scothis
Copy link
Member Author

scothis commented Sep 14, 2023

/retest

looks like a test flake

@troy0820
Copy link
Member

/retest

Did this not get retriggered? Looks like a golangci-lint error. Interesting.

@scothis
Copy link
Member Author

scothis commented Sep 14, 2023

Did this not get retriggered?

It did. the first run's flake was a timeout in the leader election code.

@troy0820
Copy link
Member

/assign @alvaroaleman

Assigning @alvaroaleman as he made the initial change. I'd look to him to give the lgtm. But this looks good to me.

Thanks for the contribution @scothis

@@ -972,6 +974,19 @@ func copyStatusFrom(old, new runtime.Object) error {
return nil
}

// copyFrom copies from old into new
func copyFrom(old, new runtime.Object) error {
oldMapStringAny, err := toMapStringAny(old)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original reason for not doing this anymore is that it might leave stale data in the target. I am confused as to why this isn't a problem anymore. Maybe we just didn't add sufficient test coverage?

Copy link
Member Author

@scothis scothis Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't changing existing behavior. No tests were modified or removed. What it does is handle a mix of typed and unstructured objects gracefully, instead of panicing.

What stale data is left behind?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What @alvaroaleman is referring to I believe is that before this function was added, we removed a function similar to this where we were doing a copyFrom because of the issues with that function leaving stale data around.
#2479

#2479 (comment)

If this being added doesn't impact what is already there (covers previous tests and your new test case) and handles the new case to fix the regression, the new test coverage should be sufficient to have this fix be in and resolve the issue with unstructured stuff.

I believe we have unstructured.Unstructured tests cases and if not being caught in testing, it begs the question how did those tests pass, if this problem still presented itself.

IANAM, (I am not a maintainer)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the previous tests and code handled both typed->typed and unstructured->unstructured, but triggered a panic with typed->unstructured. The mix can happen when the GVK is registered in the scheme, so the resource in the tracker is always converted to the typed form, and then the user performs a status update using an unstructured object.

It's a bit esoteric as most users will favor a typed resource when available, but we hit it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we also added code to zero the target, that is why it works. Thanks for the patch and sorry for the review delay

@@ -972,6 +974,19 @@ func copyStatusFrom(old, new runtime.Object) error {
return nil
}

// copyFrom copies from old into new
func copyFrom(old, new runtime.Object) error {
oldMapStringAny, err := toMapStringAny(old)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we also added code to zero the target, that is why it works. Thanks for the patch and sorry for the review delay

Expect(err).NotTo(HaveOccurred())

Expect(cl.Update(context.Background(), obj)).To(Succeed())
Expect(cl.Get(context.Background(), client.ObjectKeyFromObject(obj), obj)).To(Succeed())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you actually mind to do the retrieval and verification using corev1.Pod instead of unstructured.Unstructured? That I believe would simplify the code a bit but more importantly clarify the intention of the testcase. I had to read the title three times to understand it and that is with the context from the PR. Same for the other testcase.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the tests. Sorry for the delay, I was traveling last week.

Using typed objects for the initial and actual object content assertion.
Unstructured objects are only used for the update.

Signed-off-by: Scott Andrews <andrewssc@vmware.com>
@scothis
Copy link
Member Author

scothis commented Oct 2, 2023

/test pull-controller-runtime-test

@alvaroaleman alvaroaleman added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Oct 5, 2023
Copy link
Member

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 5, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, scothis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 5, 2023
@k8s-ci-robot k8s-ci-robot merged commit 067b29f into kubernetes-sigs:main Oct 5, 2023
9 checks passed
@scothis scothis deleted the unstructured-status-update branch October 5, 2023 13:48
@troy0820
Copy link
Member

troy0820 commented Oct 5, 2023

/cherry-pick release-0.16

@k8s-infra-cherrypick-robot

@troy0820: failed to push cherry-picked changes in GitHub: pushToNamedFork is not implemented in the v2 client

In response to this:

/cherry-pick release-0.16

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

troy0820 pushed a commit to troy0820/controller-runtime that referenced this pull request Oct 5, 2023
…#2495)

* Handle unstructured status update with fake client

In order to prevent unintentional mutations outside of the status during
a status update, the non-status fields are copied back onto the passed
object.

This operation now gracefully handles both unstructured and typed
objects. Previously, it would panic if an unstructured object was passed
for a GVK known to the scheme, as internally the object within the
tracker is converted to the typed equivalent. The two types cannot
be directly assigned to each other and instead must be copied.

Signed-off-by: Scott Andrews <andrewssc@vmware.com>

* Review feedback

Using typed objects for the initial and actual object content assertion.
Unstructured objects are only used for the update.

Signed-off-by: Scott Andrews <andrewssc@vmware.com>

---------

Signed-off-by: Scott Andrews <andrewssc@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic in fake client when updating status with Unstructured resource
5 participants