
fix(resource): match discovery burst and qps for kubectl with upstream kubectl binary #11603

Merged

Conversation

vadasambar
Contributor

May fix or alleviate issues like #11156 and #9934

Motivation

We migrated to using kubectl as a package starting v3.4.7:

Before v3.4.7 we used the kubectl binary (ref1, ref2).

The kubectl binary has used higher values for discovery burst and QPS since v1.24. kubectl passes the new QPS and burst through NewDefaultKubectlCommand, while in Argo Workflows we use NewKubectlCommand instead, which uses a default burst of 100 (we are using v0.24.3 of the kubectl package at the time of writing this).
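For context, upstream kubectl builds its root command from `genericclioptions.ConfigFlags` that carry the raised discovery limits. Below is a minimal, hedged sketch of those defaults, using only the `WithDiscoveryBurst`/`WithDiscoveryQPS` helpers from `k8s.io/cli-runtime` (not a verbatim copy of kubectl's cmd.go):

```go
package main

import (
	"fmt"

	"k8s.io/cli-runtime/pkg/genericclioptions"
)

func main() {
	// Upstream kubectl (v1.24+) raises the discovery limits on the config
	// flags it builds its root command from. Per this PR's description,
	// embedding NewKubectlCommand directly does not pick these values up,
	// which leaves the lower default burst in place.
	flags := genericclioptions.NewConfigFlags(true).
		WithDiscoveryBurst(300). // upstream default since v1.24
		WithDiscoveryQPS(50.0)   // upstream default since v1.24

	// The discovery client built from these flags inherits the burst/QPS.
	if _, err := flags.ToDiscoveryClient(); err != nil {
		fmt.Println("building discovery client:", err)
	}
}
```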

If there are 100 or more CRDs in the cluster, discovery can take time. This leaves less time for the pod to run, which can cause it to exceed activeDeadlineSeconds (check #11156 and #9934). Related: kubernetes/kubectl#1126

Modifications

This PR matches the new discovery burst and QPS set in upstream kubectl.
Related:

Note that in 1.26 (v0.26.x of the kubectl package), the default burst across all client-go clients was bumped up to 300 (with no increase in the default QPS), but we can't use this since we are on v0.24.3. Upgrading the kubectl package is an option, but I am not sure about the consequences of doing that. We might want to revisit this PR in the future and remove the lines that are already set upstream.
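Roughly, the fix amounts to building the kubectl command from config flags that carry the upstream discovery values. The sketch below is illustrative only: it assumes the `KubectlOptions`/`NewKubectlCommand` API of the v0.24.x `k8s.io/kubectl` package and uses a hypothetical `runKubectl` helper, so it is not the PR's exact diff.

```go
package main

import (
	"os"

	"k8s.io/cli-runtime/pkg/genericclioptions"
	kubectlcmd "k8s.io/kubectl/pkg/cmd"
)

// runKubectl is a hypothetical helper showing how an embedding project can
// match the upstream discovery burst/QPS when invoking kubectl as a package.
func runKubectl(args ...string) error {
	configFlags := genericclioptions.NewConfigFlags(true).
		WithDiscoveryBurst(300). // match upstream kubectl since v1.24
		WithDiscoveryQPS(50.0)

	cmd := kubectlcmd.NewKubectlCommand(kubectlcmd.KubectlOptions{
		ConfigFlags: configFlags,
		Arguments:   args,
		IOStreams:   genericclioptions.IOStreams{In: os.Stdin, Out: os.Stdout, ErrOut: os.Stderr},
	})
	cmd.SetArgs(args)
	return cmd.Execute()
}

func main() {
	if err := runKubectl("version", "--client"); err != nil {
		os.Exit(1)
	}
}
```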

Verification

@tooptoop4
Contributor

300/50 seems too high. I raised argoproj/argo-helm#2195, but 100/50 is only recommended from k8s 1.27; under k8s 1.27 it's 20/10.

@vadasambar
Contributor Author

vadasambar commented Aug 18, 2023

300/50 seems too high. I raised argoproj/argo-helm#2195, but 100/50 is only recommended from k8s 1.27; under k8s 1.27 it's 20/10.

I think you are talking about rest client burst/QPS. This PR is about discovery client burst/QPS. 300/50 has been the default upstream value since Kubernetes v1.24. You can find the details in the PR description.
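To make that distinction concrete: rest client limits live directly on `rest.Config`, while the discovery limits this PR changes are set on the config flags the discovery client is built from. A hedged sketch follows; the 20/10 values are just the numbers mentioned above, used purely for illustration.

```go
package main

import (
	"fmt"

	"k8s.io/cli-runtime/pkg/genericclioptions"
)

func main() {
	// Discovery client limits (what this PR is about): set on the config
	// flags; upstream kubectl has used 300 burst / 50 QPS here since v1.24.
	flags := genericclioptions.NewConfigFlags(true).
		WithDiscoveryBurst(300).
		WithDiscoveryQPS(50.0)

	// Rest client limits (what the Helm chart discussion is about): plain
	// fields on the rest.Config used for ordinary API requests.
	restCfg, err := flags.ToRESTConfig()
	if err != nil {
		fmt.Println("loading rest config:", err)
		return
	}
	restCfg.Burst = 20 // illustrative only, echoing the 20/10 mentioned above
	restCfg.QPS = 10   // illustrative only

	fmt.Printf("rest QPS=%v burst=%v\n", restCfg.QPS, restCfg.Burst)
}
```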

@vadasambar vadasambar force-pushed the fix/kubectl-discovery-burst-and-qps branch 2 times, most recently from 392b6ad to 5ee93b7 Compare August 22, 2023 11:36
@vadasambar
Contributor Author

vadasambar commented Aug 22, 2023

CI / E2E Tests (test-executor, minimal) seems flaky. I see the test is failing in other open PRs as well:

Makes me think the test failure is not related to this PR.

@vadasambar
Contributor Author

I think the PR is ready for review. I don't think the test failure is related to the changes in this PR; happy to take another look if the reviewers think otherwise.

@vadasambar vadasambar marked this pull request as ready for review August 22, 2023 12:09
Member

@terrytangyuan terrytangyuan left a comment


Should this be chore instead of fix?

@vadasambar
Contributor Author

Should this be chore instead of fix?

I don't understand why this would be a chore.

fix: a commit of the type fix patches a bug in your codebase (this correlates with PATCH in Semantic Versioning).

I have outlined the issues this PR can fix in the description of the PR:

check #11156 and #9934

#11603 (comment)

@vadasambar vadasambar force-pushed the fix/kubectl-discovery-burst-and-qps branch from 5ee93b7 to d9fa588 Compare August 24, 2023 10:32
@vadasambar vadasambar force-pushed the fix/kubectl-discovery-burst-and-qps branch 2 times, most recently from 242a484 to b06ce13 Compare August 25, 2023 12:10
@terrytangyuan terrytangyuan enabled auto-merge (squash) August 25, 2023 15:15
auto-merge was automatically disabled August 29, 2023 15:22

Head branch was pushed to by a user without write access

@vadasambar vadasambar force-pushed the fix/kubectl-discovery-burst-and-qps branch from b06ce13 to 179923b Compare August 29, 2023 15:22
…eam kubectl binary

Signed-off-by: vadasambar <suraj.bankar@acquia.com>

docs: re-word the TODO with more details

Signed-off-by: vadasambar <suraj.bankar@acquia.com>
@vadasambar vadasambar force-pushed the fix/kubectl-discovery-burst-and-qps branch from 179923b to 45693d4 Compare August 30, 2023 11:39
@vadasambar
Contributor Author

@terrytangyuan I think I might need approval with auto-merge or a manual merge from you.

CI is passing.

@terrytangyuan terrytangyuan merged commit d3cb451 into argoproj:master Aug 30, 2023
23 checks passed
terrytangyuan pushed a commit that referenced this pull request Sep 5, 2023
…eam kubectl binary (#11603)

Signed-off-by: vadasambar <suraj.bankar@acquia.com>
dpadhiar pushed a commit to dpadhiar/argo-workflows that referenced this pull request May 9, 2024
…eam kubectl binary (argoproj#11603)

Signed-off-by: vadasambar <suraj.bankar@acquia.com>
Signed-off-by: Dillen Padhiar <dillen_padhiar@intuit.com>
@agilgur5 agilgur5 changed the title fix(workflow): match discovery burst and qps for kubectl with upstream kubectl binary fix(resource): match discovery burst and qps for kubectl with upstream kubectl binary May 11, 2024
@agilgur5
Member

agilgur5 commented May 11, 2024

May fix or alleviate issues like #11156 and #9934

If there are 100 or more CRDs in the cluster, discovery can take time. This leaves less time for pod to run because of which pod can exceed activeDeadlineSeconds (check #11156 and #9934)

This specific change would only help those issues if they were using many resource templates with many CRDs. Those are typically a small subset of Workflows and are not used by #9934's reproduction.
