fix retry after - cooldown individual models based on their specific 'retry-after' header #5358

Merged
merged 10 commits into from
Aug 27, 2024

Conversation

krrishdholakia
Contributor

Title

cooldown individual models based on their specific 'retry-after' header

  • Uses Azure/OpenAI's 'retry-after' headers to set model cooldowns (sliding window for rate limit monitoring)
  • Returns a more accurate 'retry-after' in error messages, based on what the provider actually returned
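
For illustration, reading a provider's 'retry-after' value might look roughly like the sketch below. `parse_retry_after` is a hypothetical helper, not code from this PR; it assumes the header holds either a number of seconds or an HTTP-date, the two forms HTTP allows.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def parse_retry_after(
    headers: Mapping[str, str], now: Optional[datetime] = None
) -> Optional[float]:
    """Hypothetical helper: seconds to wait, or None if absent/unparseable."""
    value = None
    for key, val in headers.items():
        # Header names are case-insensitive.
        if key.lower() == "retry-after":
            value = val.strip()
            break
    if not value:
        return None

    # Form 1: delay in seconds, e.g. "Retry-After: 12".
    try:
        return max(0.0, float(value))
    except ValueError:
        pass

    # Form 2: HTTP-date, e.g. "Retry-After: Tue, 27 Aug 2024 17:00:30 GMT".
    try:
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())
```

Returning `None` rather than a default lets the caller decide whether to fall back to a fixed cooldown when the provider sends no usable header.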

Relevant issues

Closes #1339
Closes #3065

Type

🐛 Bug Fix
🧹 Refactoring
🚄 Infrastructure

Changes

  • Introduces a new CooldownCache to manage caching logic for model cooldowns
  • Exposes a new get_min_cooldown_time, which returns the minimum time to wait before retrying a request
  • Refactors RouterRateLimitErrors to use a common class, so they consistently return the same information
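
The idea behind per-model cooldowns can be sketched as follows. This is a minimal in-memory illustration, not the implementation in this PR: only the names `CooldownCache` and `get_min_cooldown_time` come from the description above, while the constructor, the `add` and `remaining` methods, and all signatures are assumptions.

```python
import time
from typing import Callable, Dict, Iterable, Optional


class CooldownCache:
    """Illustrative sketch: tracks when each model's cooldown expires."""

    def __init__(
        self,
        default_cooldown: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._default = default_cooldown
        self._clock = clock  # injectable so tests can control time
        self._expires_at: Dict[str, float] = {}

    def add(self, model_id: str, retry_after: Optional[float] = None) -> None:
        # Prefer the provider's retry-after; otherwise use the default.
        wait = self._default if retry_after is None else max(0.0, retry_after)
        self._expires_at[model_id] = self._clock() + wait

    def remaining(self, model_id: str) -> float:
        expires = self._expires_at.get(model_id)
        if expires is None:
            return 0.0
        left = expires - self._clock()
        if left <= 0:
            # Cooldown is over; drop the stale entry.
            del self._expires_at[model_id]
            return 0.0
        return left

    def get_min_cooldown_time(self, model_ids: Iterable[str]) -> float:
        # Shortest wait until any of the given models is usable again.
        return min((self.remaining(m) for m in model_ids), default=0.0)
```

With this shape, a model that received a 5-second 'retry-after' becomes available again long before one that received 30 seconds, and the minimum across a group is the value a caller would be told to wait.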

[REQUIRED] Testing - Attach a screenshot of any new tests passing locally

If UI changes, send a screenshot/GIF of working UI fixes

  • new test which asserts that returned error matches cooldown time in error string returned by router
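
A test of that kind could look roughly like the sketch below. `RateLimitErrorSketch` and its message wording are invented for illustration; they are not the router's actual error class or error string.

```python
class RateLimitErrorSketch(Exception):
    """Hypothetical error carrying the cooldown time it reports."""

    def __init__(self, model_group: str, cooldown_time: float) -> None:
        self.model_group = model_group
        self.cooldown_time = cooldown_time
        super().__init__(
            f"No deployments available for model group '{model_group}'. "
            f"Try again in {cooldown_time:.0f} seconds."
        )


def test_error_reports_cooldown_time() -> None:
    err = RateLimitErrorSketch("gpt-4", cooldown_time=7)
    # The structured field and the message text should agree.
    assert err.cooldown_time == 7
    assert "7 seconds" in str(err)
```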

…resent

Fixes issue where retry after on router was not using azure / openai numbers
…n openai chat completion + embedding endpoints
Updates cooldown logic to cooldown individual models

 Closes #1339


@krrishdholakia krrishdholakia merged commit 415abc8 into main Aug 27, 2024
8 of 10 checks passed