[Flaking Test][sig-network] ci-kubernetes-verify-master verify-openapi-spec #121865
Comments
cc @aojea for cidr controller |
/sig auth network |
/remove-sig auth #116516 is almost certainly related |
/assign it seems the servicecidr controller takes time to start and it blocks the apiserver readiness |
/reopen Test run failed after the related PR merged: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-verify-master/1725068113919086592 |
@Vyom-Yadav: Reopened this issue. |
yeah |
/triage accepted |
This is sig-network. @rjsadow, can you please update the title? |
/retitle [Flaking Test][sig-network] ci-kubernetes-verify-master verify-openapi-spec |
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-verify-master/1739005684546015232 This still flakes in master-blocking on testgrid. |
kubernetes/hack/local-up-cluster.sh Lines 582 to 625 in b9e0714
kubernetes/hack/update-openapi-spec.sh Lines 91 to 97 in b9e0714
My local test with a 15s timeout failed several times; 30s is enough for my local tests. Alternatively, we could just increase the timeout for CI.
|
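The timeout observation above can be illustrated with a minimal polling sketch. This is not the logic of local-up-cluster.sh or update-openapi-spec.sh; `wait_until_ready`, `FakeServer`, `simulate`, and the 20-second simulated startup are hypothetical names and values, chosen only to show how a 15s budget can fail where a 30s budget succeeds.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it returns True or timeout_s seconds have elapsed."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


class FakeServer:
    """Stand-in for an apiserver; becomes ready after ready_after_s of fake time."""

    def __init__(self, ready_after_s: float) -> None:
        self.now = 0.0
        self.ready_after_s = ready_after_s

    def clock(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        # Advance the fake clock instead of actually sleeping.
        self.now += seconds

    def ready(self) -> bool:
        return self.now >= self.ready_after_s


def simulate(timeout_s: float, ready_after_s: float = 20.0) -> bool:
    # ready_after_s=20.0 is an assumed startup time, not a measured one.
    server = FakeServer(ready_after_s)
    return wait_until_ready(
        server.ready, timeout_s, clock=server.clock, sleep=server.sleep
    )
```

With the assumed 20-second startup, `simulate(15)` times out and `simulate(30)` succeeds. Injecting the clock and sleep functions keeps the sketch testable without a running cluster.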
Having both scripts use the same timeout sounds reasonable. I'm still worried about why it keeps timing out, but for that we should first define a test that measures apiserver startup, then decide what value is correct and enforce it. |
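One possible shape for such a startup measurement, as a sketch under stated assumptions: time how long a readiness check takes to pass, collect several samples, and derive a timeout from them. `measure_startup` and `recommend_timeout` are hypothetical helpers, and the "slowest sample times a safety factor" rule is an illustrative policy, not something agreed in this thread.

```python
import math
import time
from typing import Callable, Optional, Sequence


def measure_startup(
    check: Callable[[], bool],
    interval_s: float = 0.5,
    give_up_after_s: float = 120.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return seconds until check() first passes, or None if it never does."""
    start = clock()
    while clock() - start <= give_up_after_s:
        if check():
            return clock() - start
        sleep(interval_s)
    return None


def recommend_timeout(samples_s: Sequence[float], safety_factor: float = 2.0) -> int:
    """Derive a timeout (whole seconds) from observed startup times.

    Assumed policy: slowest observed startup multiplied by a safety factor.
    """
    if not samples_s:
        raise ValueError("need at least one startup sample")
    return math.ceil(max(samples_s) * safety_factor)
```

For example, with hypothetical samples of 8.0, 12.5, and 14.0 seconds, `recommend_timeout([8.0, 12.5, 14.0])` returns 28. In practice `check` would probe the apiserver's health endpoint, and the resulting value could be shared by both scripts.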
/priority important-soon |
Hello! As this issue is tagged for 1.30, is it still planned for this release? |
This flakes in https://testgrid.k8s.io/sig-release-master-blocking#verify-master. Do we have some action items for v1.30? |
I need to fix it; shame on me, sorry. |
Hello! As this issue is tagged for 1.30, is it still planned for this release? Please let us know the status if possible. |
Yeah, I'd like to get it fixed. There is some race condition that makes part of the bootstrap deadlock; I still haven't found why. |
I haven't forgotten, but I haven't had time to figure out the root cause. I will try to get back to this at the end of the week. |
I got one failure locally where the apiserver took a long time to start
and this keeps going for a while
@wojtek-t @liggitt do you know why the resource version is growing? |
looks like the server returns that error in a few places, not sure which one is getting exercised: kubernetes/staging/src/k8s.io/apiserver/pkg/storage/etcd3/watcher.go Lines 362 to 369 in b3926d1
kubernetes/staging/src/k8s.io/apiserver/pkg/storage/cacher/watch_cache.go Lines 473 to 476 in b3926d1
kubernetes/staging/src/k8s.io/apiserver/pkg/storage/etcd3/store.go Lines 1000 to 1003 in b3926d1
update-openapi-spec starts with AllAlpha=true in order to enable all features / APIs ... @wojtek-t @p0lyn0mial @serathius are there any alpha features around watch / resourceVersion that could cause this? |
I didn't read the issue carefully, so I'm just quickly sharing some thoughts:
Is it possible that this value is set? @p0lyn0mial ^^
@serathius ^^ |
Based on the log from #121865 (comment)
I would guess this is #123674.
I planned to fix it, but the PR was blocked on #123732. |
Thanks a ton for digging into that, @aojea. To deflake this test, I'd set this in the update-openapi-spec.sh file (since it doesn't impact the generated OpenAPI), with a comment to remove it once #123674 is resolved.
|
Which jobs are flaking?
Verify Master - Master Blocking https://testgrid.k8s.io/sig-release-master-blocking#verify-master
Which tests are flaking?
verify.openapi-spec
Since when has it been flaking?
Inconsistently since 2 November. First failure (shown in testgrid) https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-verify-master/1719970042151440384
Testgrid link
https://testgrid.k8s.io/sig-release-master-blocking#verify-master
Reason for failure (if possible)
API Server is not becoming ready during
hack/make-rules/../../hack/verify-openapi-spec.sh
Anything else we need to know?
Relevant logs
Relevant SIG(s)
/sig-testing