TestStoppedWorkflow fail sometimes (flakey test) #12836

Closed
4 tasks done
shuangkun opened this issue Mar 22, 2024 · 7 comments · Fixed by #12831
Labels
  • area/artifacts: S3/GCP/OSS/Git/HDFS etc
  • area/build: Build or GithubAction/CI issues
  • P2: Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important
  • type/bug
  • type/tech-debt

Comments

@shuangkun
Member

shuangkun commented Mar 22, 2024

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

TestStoppedWorkflow sometimes fails when submitting a PR. I have also run into it several times:
https://github.com/argoproj/argo-workflows/actions/runs/8374128100/job/22928794428?pr=12780
https://github.com/argoproj/argo-workflows/actions/runs/8387303342/job/22969335701?pr=12833
https://github.com/argoproj/argo-workflows/actions/runs/8400671174/job/23016526101?pr=12838
https://github.com/argoproj/argo-workflows/actions/runs/8431760172/job/23089785822?pr=12817
https://github.com/argoproj/argo-workflows/actions/runs/8431760172/job/23089787055?pr=12817
https://github.com/argoproj/argo-workflows/actions/runs/8475112686/job/23222655428?pr=12854
https://github.com/argoproj/argo-workflows/actions/runs/8478188887/job/23230185040?pr=12809
https://github.com/argoproj/argo-workflows/actions/runs/8478402751/job/23230688385?pr=12809

I want

The test should always pass.

Version

latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

Not needed.

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded
@shuangkun shuangkun self-assigned this Mar 22, 2024
@shuangkun shuangkun changed the title from "TestStoppedWorkflow failed sometimes" to "TestStoppedWorkflow fail sometimes" Mar 22, 2024
@shuangkun shuangkun added the area/artifacts S3/GCP/OSS/Git/HDFS etc label Mar 22, 2024
@agilgur5 agilgur5 added the area/build Build or GithubAction/CI issues label Mar 22, 2024
@agilgur5
Member

https://github.com/argoproj/argo-workflows/actions/runs/8374128100/job/22928794428?pr=12780
https://github.com/argoproj/argo-workflows/actions/runs/8387303342/job/22969335701?pr=12833
https://github.com/argoproj/argo-workflows/actions/runs/8400671174/job/23016526101?pr=12838

Could you put the content of the failure log in these issues? GitHub only keeps the logs for a certain time period, so a permanent log of the failure is helpful for debugging and historical purposes

@shuangkun
Member Author

logs_21990719476.zip

@shuangkun
Member Author

https://github.com/argoproj/argo-workflows/actions/runs/8374128100/job/22928794428?pr=12780
https://github.com/argoproj/argo-workflows/actions/runs/8387303342/job/22969335701?pr=12833
https://github.com/argoproj/argo-workflows/actions/runs/8400671174/job/23016526101?pr=12838

Could you put the content of the failure log in these issues? GitHub only keeps the logs for a certain time period, so a permanent log of the failure is helpful for debugging and historical purposes

Uploaded it.

@agilgur5
Member

logs_21990719476.zip

I didn't mean the entire log, just the single test failure. The rest of the logs aren't necessarily actionable.
See #9027 (comment) and #10307 (comment) as examples

@shuangkun
Member Author

 ● artgc-dag-wf-stopped-pod-gc-on-pod-completion-hc2s4   Workflow 0s      
 └ ✖ artgc-dag-wf-stopped-pod-gc-on-pod-completion-hc2s4 DAG      22s     
 └ ✔ create-artifact-1                                   Pod      9s      
 └ ✖ create-artifact-2                                   Pod      10s     workflow shutdown with strategy:  Stop
 └ ✔ delay-stop-workflow                                 Pod      6s      
 └ ✖ stop-workflow                                       Pod      2s      workflow shutdown with strategy:  Stop

 ● artgc-dag-wf-stopped-pod-gc-on-pod-completion-hc2s4   Workflow 0s      
 └ ✖ artgc-dag-wf-stopped-pod-gc-on-pod-completion-hc2s4 DAG      22s     
 └ ✔ create-artifact-1                                   Pod      9s      
 └ ✖ create-artifact-2                                   Pod      10s     workflow shutdown with strategy:  Stop
 └ ✔ delay-stop-workflow                                 Pod      6s      
 └ ✖ stop-workflow                                       Pod      2s      workflow shutdown with strategy:  Stop

Condition "for artifacts to exist" met after 22s
Deleting artgc-dag-wf-stopped-pod-gc-on-pod-completion-hc2s4
Waiting 1m30s for workflows {{ } workflows.argoproj.io/test metadata.name=artgc-dag-wf-stopped-pod-gc-on-pod-completion-hc2s4 false false   <nil> 0 }
    when.go:356: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
    artifacts_test.go:214: 
        	Error Trace:	/home/runner/work/argo-workflows/argo-workflows/test/e2e/artifacts_test.go:214
        	            				/home/runner/work/argo-workflows/argo-workflows/test/e2e/fixtures/then.go:251
        	            				/home/runner/work/argo-workflows/argo-workflows/test/e2e/artifacts_test.go:213
        	Error:      	Expected value not to be nil.
        	Test:       	TestArtifactsSuite/TestStoppedWorkflow
    artifacts_test.go:217: 
        	Error Trace:	/home/runner/work/argo-workflows/argo-workflows/test/e2e/artifacts_test.go:217
        	            				/home/runner/work/argo-workflows/argo-workflows/test/e2e/fixtures/then.go:251
        	            				/home/runner/work/argo-workflows/argo-workflows/test/e2e/artifacts_test.go:216
        	Error:      	Expected value not to be nil.
        	Test:       	TestArtifactsSuite/TestStoppedWorkflow
=== FAIL: ArtifactsSuite/TestStoppedWorkflow

@agilgur5 agilgur5 changed the title from "TestStoppedWorkflow fail sometimes" to "TestStoppedWorkflow fail sometimes (flakey test)" Mar 24, 2024
@agilgur5 agilgur5 added the P2 Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important label Mar 24, 2024
@blkperl
Contributor

blkperl commented Mar 30, 2024

hey @Garett-MacGowan would you mind taking a look at the test failure and the proposed fix? It looks like the test was added in #11947

Proposed fix: #12831

@Garett-MacGowan
Contributor

hey @Garett-MacGowan would you mind taking a look at the test failure and the proposed fix? It looks like the test was added in #11947

Proposed fix: #12831

Will do

juliev0 pushed a commit that referenced this issue Apr 2, 2024
…2831)

Signed-off-by: shuangkun <tsk2013uestc@163.com>
@agilgur5 agilgur5 added this to the v3.5.x patches milestone Apr 19, 2024
agilgur5 pushed a commit that referenced this issue Apr 19, 2024
…2831)

Signed-off-by: shuangkun <tsk2013uestc@163.com>
(cherry picked from commit fb6c3d0)
isubasinghe pushed a commit to isubasinghe/argo-workflows that referenced this issue May 6, 2024
isubasinghe pushed a commit to isubasinghe/argo-workflows that referenced this issue May 7, 2024

4 participants