
Make the kubeflow-m2m-oidc-configurator a CronJob #2667

Open · wants to merge 2 commits into master from make-the-oidc-configurator-a-cronjob
Conversation

kromanow94
Contributor

Which issue is resolved by this Pull Request:
Resolves #2646

Description of your changes:
Changing the Job to a CronJob improves the robustness of the setup in case the JWKS changes or the user accidentally overwrites the RequestAuthentication.
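For illustration, the change essentially wraps the existing configurator in a CronJob along these lines. This is a sketch assembled from the snippets discussed in the review threads below; the container name, mount path, and ConfigMap name are assumptions, not the exact manifest in this PR:

```yaml
# Sketch of the Job-to-CronJob wrapping; names match the review snippets below,
# but the container name, mount path, and ConfigMap name are assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kubeflow-m2m-oidc-configurator
  namespace: istio-system
spec:
  schedule: '* * * * *'        # re-run periodically so JWKS changes are re-applied
  concurrencyPolicy: Forbid    # never start a new Job while the previous one still runs
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: kubeflow-m2m-oidc-configurator
          restartPolicy: OnFailure
          containers:
            - name: configurator                         # assumed name
              image: docker.io/curlimages/curl
              command: ['/bin/sh', '/scripts/script.sh']  # assumed mount path
              volumeMounts:
                - name: script
                  mountPath: /scripts
          volumes:
            - name: script
              configMap:
                name: kubeflow-m2m-oidc-configurator      # assumed ConfigMap name
                defaultMode: 0777
                items:
                  - key: script.sh
                    path: script.sh
```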

Checklist:

  • Tested on kind and on vcluster.


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kromanow94

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kromanow94
Contributor Author

kromanow94 commented Apr 4, 2024

@juliusvonkohout or @kimwnasptd can we restart the tests? Both of them failed because of an unrelated issue:

timed out waiting for the condition on pods/kubeflow-m2m-oidc-configurator-28537425-s8kzm
timed out waiting for the condition on pods/activator-bd5fdc585-rrnqf
timed out waiting for the condition on pods/autoscaler-5655dd9df5-4knpj
timed out waiting for the condition on pods/controller-5447f77dc5-ljx5r
timed out waiting for the condition on pods/domain-mapping-757799d898-knf69
timed out waiting for the condition on pods/domainmapping-webhook-5d875ccb7d-z2qjv
timed out waiting for the condition on pods/net-istio-controller-5f89595bcb-dv7h2
timed out waiting for the condition on pods/net-istio-webhook-dc448cfc4-rws5f
timed out waiting for the condition on pods/webhook-578c5cf66f-25sf9
timed out waiting for the condition on pods/coredns-5dd5756b68-hpg77
timed out waiting for the condition on pods/coredns-5dd5756b68-vv66m
timed out waiting for the condition on pods/etcd-kind-control-plane
timed out waiting for the condition on pods/kindnet-9l886
timed out waiting for the condition on pods/kindnet-pftsz
timed out waiting for the condition on pods/kindnet-z5qpl
timed out waiting for the condition on pods/kube-apiserver-kind-control-plane
timed out waiting for the condition on pods/kube-controller-manager-kind-control-plane
timed out waiting for the condition on pods/kube-proxy-64vj7
timed out waiting for the condition on pods/kube-proxy-vk4lr
timed out waiting for the condition on pods/kube-proxy-xwm8d
timed out waiting for the condition on pods/kube-scheduler-kind-control-plane
timed out waiting for the condition on pods/local-path-provisioner-7577fdbbfb-7zv5k
timed out waiting for the condition on pods/oauth2-proxy-86d8c97455-hvjl8
timed out waiting for the condition on pods/oauth2-proxy-86d8c97455-z9vjw
Error: Process completed with exit code 1.

@juliusvonkohout
Member

@KRomanov, I restarted the tests. If they fail again we might have to increase the timeouts in this PR.
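For reference, the "timed out waiting for the condition" messages above are the usual error output of kubectl wait, so raising the timeout would look roughly like the following in the workflow scripts. The namespaces and the 600s value are assumptions, not the repository's actual CI steps:

```yaml
# Hypothetical workflow step; namespaces, selector, and timeout value are illustrative only.
- name: Wait for pods to become Ready
  run: |
    kubectl wait --for=condition=Ready pods --all -n istio-system --timeout=600s
    kubectl wait --for=condition=Ready pods --all -n knative-serving --timeout=600s
```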

Signed-off-by: Krzysztof Romanowski <krzysztof.romanowski.kr3@roche.com>
Signed-off-by: Krzysztof Romanowski <krzysztof.romanowski.kr3@roche.com>
@kromanow94 force-pushed the make-the-oidc-configurator-a-cronjob branch from b98a24d to 4abca40 on April 11, 2024 13:41
@kromanow94
Contributor Author

@juliusvonkohout this is super weird. I limited the CronJob with concurrencyPolicy: Forbid. I don't know if this should be handled by increasing the timeout or by increasing the resources for the CICD Jobs... I can also try to split the installation steps to limit how many pods are created at the same time...
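For reference, the concurrency limit mentioned above is the standard CronJob field; the history limits in this sketch are an assumption added only to show how leftover pods could also be capped, they are not part of this PR:

```yaml
spec:
  schedule: '* * * * *'
  concurrencyPolicy: Forbid       # skip the next run while the previous Job is still active
  successfulJobsHistoryLimit: 1   # assumed values, not part of this PR
  failedJobsHistoryLimit: 1
```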

@juliusvonkohout
Member

juliusvonkohout commented Apr 15, 2024

I restarted the tests. Yeah, our CICD is a bit problematic at the moment. If we can specify more resources in this public repository, yes; otherwise we have to increase the timeouts. https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories

@kromanow94
Contributor Author

@juliusvonkohout maybe the issue is with CICD resource sharing? If the memory and CPU are shared between multiple workflows, it may be problematic. I see one of the failing tests completed successfully. Can you restart the last test workflow?

Also, is this something I could do myself, for example with the GitHub bot with commands in a comment?

restartPolicy: OnFailure
serviceAccountName: kubeflow-m2m-oidc-configurator
containers:
- image: curlimages/curl
Member

Is this from docker.io?

Contributor Author

Correct.

Member

Probably we should then specify it as docker.io/curlimages/curl.
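That would just mean fully qualifying the image reference, for example (the container name is assumed for this snippet):

```yaml
containers:
  - name: configurator                 # container name assumed for this snippet
    image: docker.io/curlimages/curl   # explicit registry instead of relying on the docker.io default
```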

name: kubeflow-m2m-oidc-configurator
namespace: istio-system
spec:
schedule: '* * * * *'
Member

Should we not go with every 5 minutes instead of every minute?

Contributor Author

I can change it to every 5 minutes. There is also configuration for not adding more Jobs until the last one has completed, and the latest logs from the CICD workflows show that no more than one Job is created at a time.
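A five-minute schedule combined with that concurrency limit would look like this (sketch only, not the exact diff):

```yaml
spec:
  schedule: '*/5 * * * *'      # every 5 minutes instead of every minute
  concurrencyPolicy: Forbid    # no new Job until the previous one has completed
```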

defaultMode: 0777
items:
- key: script.sh
path: script.sh
Member

Are you sure that script.sh is idempotent?

Contributor Author

Huh, well, it doesn't verify whether the JWKS is already present and always performs the patch, so this could be an improvement. I think the JWKS value should also be compared and only patched if it differs.
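For illustration, an idempotent variant could compare the current and desired JWKS and only patch when they differ. A rough sketch, shipped the same way as the existing script (a ConfigMap key mounted into the pod); the ConfigMap name, RequestAuthentication name, API version, and the availability of jq in the image are all assumptions rather than what this PR actually contains:

```yaml
# Hypothetical sketch of an idempotent script.sh; names and the use of jq are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeflow-m2m-oidc-configurator
  namespace: istio-system
data:
  script.sh: |
    #!/bin/sh
    set -eu
    SA=/var/run/secrets/kubernetes.io/serviceaccount
    TOKEN="$(cat "$SA/token")"
    API=https://kubernetes.default.svc
    # RequestAuthentication to keep in sync (name assumed for this sketch).
    RA="$API/apis/security.istio.io/v1beta1/namespaces/istio-system/requestauthentications/kubeflow-m2m-oidc"

    # JWKS currently served by the in-cluster OIDC issuer (the kube-apiserver).
    desired="$(curl -sf --cacert "$SA/ca.crt" -H "Authorization: Bearer $TOKEN" "$API/openid/v1/jwks")"

    # JWKS currently configured on the RequestAuthentication (assumes jq is available in the image).
    current="$(curl -sf --cacert "$SA/ca.crt" -H "Authorization: Bearer $TOKEN" "$RA" \
      | jq -r '.spec.jwtRules[0].jwks // empty')"

    if [ "$desired" = "$current" ]; then
      echo "JWKS already up to date; nothing to patch."
      exit 0
    fi

    # Replace only the jwks field, leaving the issuer and other jwtRules settings untouched.
    curl -sf --cacert "$SA/ca.crt" -H "Authorization: Bearer $TOKEN" \
      -X PATCH -H 'Content-Type: application/json-patch+json' "$RA" \
      --data "$(jq -cn --arg jwks "$desired" \
        '[{op: "replace", path: "/spec/jwtRules/0/jwks", value: $jwks}]')"
```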

@juliusvonkohout
Member

> @juliusvonkohout maybe the issue is with CICD resource sharing? If the memory and CPU are shared between multiple workflows, it may be problematic. I see one of the failing tests completed successfully. Can you restart the last test workflow?
>
> Also, is this something I could do myself, for example with the GitHub bot with commands in a comment?

I did restart it and it failed again. In the KFP repository that was possible with /retest or /retest-failed or so. Probably something I can investigate in the coming weeks when I am less busy.

@kromanow94
Contributor Author

@juliusvonkohout maybe we could add verbosity to the logs in the CICD GH Workflows? We currently know that the pods aren't ready, but what is the actual reason? DockerHub pull rate limits? Not enough resources? A failing Pod?
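For illustration, a debug step along these lines could surface why pods never become Ready; the exact commands are an assumption, not an existing step in the workflows:

```yaml
# Hypothetical GitHub Actions step for the CI workflows.
- name: Dump cluster state on failure
  if: failure()
  run: |
    kubectl get pods -A -o wide
    kubectl describe pods -A | tail -n 300
    kubectl get events -A --sort-by=.lastTimestamp | tail -n 100
```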

@juliusvonkohout
Member

> @juliusvonkohout maybe we could add verbosity to the logs in the CICD GH Workflows? We currently know that the pods aren't ready, but what is the actual reason? DockerHub pull rate limits? Not enough resources? A failing Pod?

Yes, let's do that in a separate PR with @codablock as well.

@juliusvonkohout
Member

The tests in #2696 were successful, so I reran the test and hope that the CICD is happy now. If not, please rebase the PR against the master branch.

@juliusvonkohout
Member

Here is the successful test: https://github.com/kubeflow/manifests/actions/runs/8891109875

@juliusvonkohout
Member

So we need a rebase and step-by-step debugging with minimal changes.

@juliusvonkohout
Member

/hold

@juliusvonkohout
Member

/retest


Successfully merging this pull request may close these issues.

  • Make the oidc-issuer configurator a CronJob to ensure correct JWKS for the in-cluster self-signed OIDC Issuer