
[frontend] Latency when listing runs #10778

Closed
droctothorpe opened this issue May 3, 2024 · 7 comments

@droctothorpe
Contributor

droctothorpe commented May 3, 2024

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    Using the AWS helm chart.
  • KFP version:
    2.0.5

Steps to reproduce

  • Create 10+ runs with the same experiment.
  • Go to the runs page.
  • The frontend will make 10+ requests to the same get experiment endpoint: https://<domain>/pipeline/apis/v2beta1/experiments/<experiment id>.
  • The concurrent requests often incur latency of 1+ second each, and the delays are cumulative. Some of our users are experiencing cumulative delays of 15+ seconds. Sometimes the runs page never loads at all.
  • It may take several refreshes to recreate; the error is somewhat intermittent.

Expected result

In an ideal world, the frontend would cache the experiment and not make the same request 10 times. In addition, even without caching, 10 experiment API requests probably shouldn't introduce this much latency. Also, this might be a red herring, but the experiment requests all include a Cache-Control: no-cache header. cc @owmasch.
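The caching idea above could be as simple as memoizing the in-flight request per experiment ID, so N runs that share one experiment trigger only one network call. A minimal sketch (the `Experiment` shape and `fetchExperiment` parameter are illustrative stand-ins for the real KFP client, not the actual frontend API):

```typescript
// Hypothetical sketch: dedupe/cache experiment requests by ID.
// Caching the Promise (not the resolved value) also collapses
// concurrent requests for the same ID into a single call.
type Experiment = { id: string; name: string };

const experimentCache = new Map<string, Promise<Experiment>>();

function getExperimentCached(
  id: string,
  fetchExperiment: (id: string) => Promise<Experiment>,
): Promise<Experiment> {
  let cached = experimentCache.get(id);
  if (!cached) {
    cached = fetchExperiment(id); // only the first caller hits the network
    experimentCache.set(id, cached);
  }
  return cached;
}
```

With this in place, 10 runs under the same experiment would issue one request instead of 10, regardless of whether the server honors Cache-Control.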


Impacted by this bug? Give it a 👍.

@rimolive
Member

rimolive commented May 4, 2024

I saw your comment in #10230, but I believe those are different issues. Although the old issue says it's a frontend issue, we discovered it was actually a database issue, and creating indexes improved performance.

Now the challenge is figuring out what's going on in the frontend that makes so many repetitive requests. Thank you for describing the steps to reproduce; I'll follow them and see if there's something we can do to propose a workaround until we have the actual fix.

@droctothorpe
Contributor Author

Thank you, @rimolive. We will continue investigating workarounds on our end as well.

@droctothorpe
Contributor Author

Found the following in the KFP Pipeline logs. It seems very relevant. I'm continuing to investigate, but I want to share it now in case it's helpful for anyone else who is debugging:

 Waited for 1.133348867s due to client-side throttling, not priority and fairness, request: POST:https://<obfuscated>/apis/authorization.k8s.io/v1/subjectaccessreviews

@droctothorpe
Contributor Author

I think we just need to adjust the QPS and burst when instantiating the K8s client for the KFP API. Will validate the fix and update here.

@droctothorpe
Contributor Author

After investigating more thoroughly with @tarat44, it seems like modifying the QPS is probably not the optimal solution.

We're going to try updating the list runs page to make a single call to https://www.kubeflow.org/docs/components/pipelines/v1/reference/api/kubeflow-pipeline-api-spec/#operation--apis-v1beta1-experiments-get instead of individual calls for each experiment ID, then parse the larger response, to reduce latency.

Please let us know if this seems reasonable. Thanks!
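The approach described above (one list call, then a local lookup) could look roughly like this. This is a sketch only; `listExperiments`, `resolveRunExperiments`, and the `Experiment` shape are illustrative names, not the actual KFP frontend code:

```typescript
// Hypothetical sketch: replace N getExperiment calls with one
// list-experiments call, then resolve each run's experiment from
// an in-memory id -> experiment map.
type Experiment = { id: string; display_name: string };

function indexExperiments(experiments: Experiment[]): Map<string, Experiment> {
  const byId = new Map<string, Experiment>();
  for (const exp of experiments) {
    byId.set(exp.id, exp);
  }
  return byId;
}

async function resolveRunExperiments(
  runExperimentIds: string[],
  listExperiments: () => Promise<Experiment[]>,
): Promise<(Experiment | undefined)[]> {
  const byId = indexExperiments(await listExperiments()); // single request
  return runExperimentIds.map(id => byId.get(id));
}
```

This trades N round trips (each subject to throttling) for one larger response plus O(1) map lookups per run.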

@thesuperzapper
Member

@droctothorpe that seems reasonable.

Are there any other places which are making repeated requests that can be either cached, or combined?

Also, I guess we could have the backend cache the result of the subject access review for a few seconds.
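The short-lived server-side cache suggested above could be sketched as a small TTL cache keyed on the authorization check. All names here are illustrative (this is not the actual KFP backend code, and the real check would issue a Kubernetes SubjectAccessReview):

```typescript
// Hypothetical sketch: cache authorization check results for a few
// seconds so repeated identical checks skip the Kubernetes API
// round trip that was triggering client-side throttling.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  constructor(private ttlMs: number) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // expired; evict and miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

async function isAllowedCached(
  user: string,
  verb: string,
  cache: TtlCache<boolean>,
  check: (user: string, verb: string) => Promise<boolean>, // real SAR call
): Promise<boolean> {
  const key = `${user}:${verb}`;
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const allowed = await check(user, verb);
  cache.set(key, allowed);
  return allowed;
}
```

A short TTL (a few seconds) keeps the window for stale authorization decisions small while collapsing the burst of identical checks the runs page generates.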

@droctothorpe
Contributor Author

Thanks, @thesuperzapper. I just submitted a preliminary PR. I still have to update the tests but am waiting for design review before doing so. Going through the frontend code, there do seem to be other opportunities to reduce latency. Here is another example where Promise.all is used when a single call to list pipelines could be used instead.
