Can't use retryStrategy AND hooks on a step that outputs an artifact with downstream consumers #12109
Comments
I should explain that the Python script is designed to randomly take too long, so that it fails with the timeout about 50% of the time. I inadvertently left the commented-out system.fail() in there, sorry.
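For readers without the original script, here is a minimal sketch of the behavior described above: a step that overruns its deadline on roughly half of its attempts and otherwise writes its output file. The function name `run_flaky_step`, its parameters, and the 2x / 0.1x durations are hypothetical choices for illustration, and the timeout itself is assumed to be enforced externally by the workflow, not by the script.

```python
import random
import time
from pathlib import Path


def run_flaky_step(timeout_s=5.0, output=Path("output.txt"), rng=random, sleep=time.sleep):
    """Sleep past the deadline on about half of the calls, then write the output.

    Returns the number of seconds slept. When that value exceeds timeout_s,
    an externally enforced timeout (assumed, not implemented here) would stop
    the pod before the file is written, which is what would trigger a retry.
    """
    too_slow = rng.random() < 0.5               # roughly 50% of attempts overrun
    duration = timeout_s * (2.0 if too_slow else 0.1)
    sleep(duration)
    output.write_text("result\n")               # reached only if nothing killed the process
    return duration
```

Passing a seeded `random.Random` and a stub `sleep` makes the behavior reproducible without actually waiting, which is convenient when checking the retry logic locally.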
I think it might be related to this other issue that was recently fixed, but I'm not sure.
It also fails to resolve the artifact when only a regular exit handler (a hook named exit) is used.
I was able to reproduce this issue using the latest version of Argo Workflows (v3.5.4), which includes #12192. I'm posting a few examples to clarify the issue we're experiencing. Outputs are properly recognized for steps that use EITHER lifecycle hooks OR retryStrategy:

metadata:
  generateName: artifact-hooks-
spec:
  entrypoint: all-steps
  templates:
    - name: all-steps
      steps:
        - - name: step-one
            inline:
              container:
                name: main
                image: argoproj/argosay:v2
                command:
                  - /bin/bash
                  - "-c"
                args:
                  - "touch /output.txt"
              outputs:
                artifacts:
                  - name: output-file
                    path: /output.txt
            hooks:
              success:
                expression: 'steps["step-one"].status == "Succeeded"'
                template: exit-zero
        - - name: step-two
            inline:
              inputs:
                artifacts:
                  - name: input-file
                    path: /input.txt
              container:
                name: main
                image: argoproj/argosay:v2
                command:
                  - /bin/bash
                  - "-c"
                args:
                  - "exit 0"
            arguments:
              artifacts:
                - name: input-file
                  from: "{{steps.step-one.outputs.artifacts.output-file}}"
    - name: exit-zero
      container:
        name: main
        image: argoproj/argosay:v2
        command:
          - /bin/bash
          - "-c"
        args:
          - "exit 0"

metadata:
  generateName: artifact-retries-
spec:
  entrypoint: all-steps
  templates:
    - name: all-steps
      steps:
        - - name: step-one
            inline:
              retryStrategy:
                limit: "3"
              container:
                name: main
                image: argoproj/argosay:v2
                command:
                  - /bin/bash
                  - "-c"
                args:
                  - "touch /output.txt"
              outputs:
                artifacts:
                  - name: output-file
                    path: /output.txt
        - - name: step-two
            arguments:
              artifacts:
                - name: input-file
                  from: "{{steps.step-one.outputs.artifacts.output-file}}"
            inline:
              inputs:
                artifacts:
                  - name: input-file
                    path: /input.txt
              container:
                name: main
                image: argoproj/argosay:v2
                command:
                  - /bin/bash
                  - "-c"
                args:
                  - "exit 0"

But if the step with outputs uses retries AND hooks, then the step after it does not recognize the output of the first step:

metadata:
  generateName: artifact-retries-hooks-
spec:
  entrypoint: all-steps
  templates:
    - name: all-steps
      steps:
        - - name: step-one
            inline:
              retryStrategy:
                limit: "3"
              container:
                name: main
                image: argoproj/argosay:v2
                command:
                  - /bin/bash
                  - "-c"
                args:
                  - "touch /output.txt"
              outputs:
                artifacts:
                  - name: output-file
                    path: /output.txt
            hooks:
              success:
                expression: 'steps["step-one"].status == "Succeeded"'
                template: exit-zero
        - - name: step-two
            inline:
              inputs:
                artifacts:
                  - name: input-file
                    path: /input.txt
              container:
                name: main
                image: argoproj/argosay:v2
                command:
                  - /bin/bash
                  - "-c"
                args:
                  - "exit 0"
            arguments:
              artifacts:
                - name: input-file
                  from: "{{steps.step-one.outputs.artifacts.output-file}}"
    - name: exit-zero
      container:
        name: main
        image: argoproj/argosay:v2
        command:
          - /bin/bash
          - "-c"
        args:
          - "exit 0"

The failed node includes the following message:
…2109 Signed-off-by: shuangkun <tsk2013uestc@163.com>
The cause is probably that the hook node is sometimes selected, nondeterministically, when building the LocalScope.
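To make that hypothesis concrete, here is a toy model of how an order-dependent choice between a step's retry node and its hook node could make artifact resolution fail only some of the time. This is not the controller's code: the node names, the `build_scope` and `resolve` helpers, and the use of a random pick to stand in for the nondeterministic selection are all assumptions made for the illustration.

```python
import random

# Hypothetical child nodes recorded for "step-one"; names are illustrative only.
RETRY_NODE = {"name": "step-one", "artifacts": ["output-file"]}
HOOK_NODE = {"name": "step-one.hooks.success", "artifacts": []}
REF = "steps.step-one.outputs.artifacts.output-file"


def build_scope(children, rng):
    """Toy scope builder: records the outputs of ONE child node, chosen arbitrarily."""
    scope = {}
    picked = rng.choice(children)  # stands in for an order-dependent selection
    for artifact in picked["artifacts"]:
        scope[f"steps.step-one.outputs.artifacts.{artifact}"] = picked["name"]
    return scope


def resolve(scope, ref):
    """Look up a reference, failing the way a downstream step would."""
    if ref not in scope:
        raise KeyError(f'Unable to resolve: "{ref}"')
    return scope[ref]


def failure_rate(children, trials=1000, seed=0):
    """Fraction of trials in which the downstream reference cannot be resolved."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        try:
            resolve(build_scope(children, rng), REF)
        except KeyError:
            failures += 1
    return failures / trials
```

In this model, `failure_rate([RETRY_NODE])` never fails, while `failure_rate([RETRY_NODE, HOOK_NODE])` fails on some fraction of trials, which matches the observation that retries or hooks alone work but the combination breaks intermittently.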
Pre-requisites
:latest
What happened/what you expected to happen?
I would expect to be able to use both hooks and retries on the same step. Either one works on its own, but together they break artifact resolution for downstream consumers. The workflow shows the error:
unable to resolve references: Unable to resolve: "steps.build.outputs.artifacts.result"
Additionally, the build step shows as Running even though all of its pods have completed or errored.
Version
v3.5.0
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
Logs from the workflow controller
Logs from your workflow's wait container