Fix triton streaming completions bug #8386

Merged: 5 commits, Mar 10, 2025

minwhoo
Contributor

@minwhoo minwhoo commented Feb 8, 2025

Title

Reimplements triton streaming handling code that got lost during refactoring.

  • For streaming requests, the "_stream" suffix must be appended to the api_base URL.
  • The get_model_response_iterator method now returns a TritonResponseIterator.
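
The two points above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `build_triton_url` and `iter_text_chunks` are hypothetical helpers, and the `data: {...}` line format and the `text_output` key are assumptions about what a Triton streaming endpoint returns.

```python
import json
from typing import Iterable, Iterator


def build_triton_url(api_base: str, stream: bool) -> str:
    """Hypothetical helper: pick the endpoint for a request.

    For streaming, append the "_stream" suffix to api_base
    (unless the caller already supplied it).
    """
    base = api_base.rstrip("/")
    if stream and not base.endswith("_stream"):
        return base + "_stream"
    return base


def iter_text_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical stand-in for a response iterator.

    Assumes each streamed event is a line of the form 'data: {json}'
    and that the generated text lives under the 'text_output' key.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and anything unexpected
        payload = json.loads(line[len("data:"):])
        text = payload.get("text_output")
        if text:
            yield text
```

With these helpers, a non-streaming call keeps the api_base unchanged, while a streaming call targets the suffixed URL and consumes the response line by line instead of parsing a single JSON body.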

Relevant issues

Fixes #8362

Type

🐛 Bug Fix

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

Contributor

@ishaan-jaff ishaan-jaff left a comment

pls add a test in llm translation

@ishaan-jaff
Contributor

bump on this @minwhoo

@minwhoo
Contributor Author

minwhoo commented Feb 13, 2025

Added streaming test

@krrishdholakia
Contributor

Thanks @minwhoo, can you share a screenshot of it passing?

Should be good to merge once that's up! 👍

Appreciate your work on this

@minwhoo
Contributor Author

minwhoo commented Feb 13, 2025

@krrishdholakia Sure thing. Managed to catch a mistake I made. Would this screenshot suffice?
[Image: Screenshot 2025-02-13 at 4 44 55 PM]

@minwhoo minwhoo requested a review from ishaan-jaff February 21, 2025 09:56
Contributor

@ishaan-jaff ishaan-jaff left a comment

LGTM

@ishaan-jaff ishaan-jaff merged commit 94667e1 into BerriAI:main Mar 10, 2025
2 checks passed
@ishaan-jaff
Contributor

Hi @minwhoo, would you be open to working with us on LiteLLM as a founding engineer? (I'm impressed by your work on this)

If this sounds interesting here's my cal, please book some time: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

@minwhoo
Contributor Author

minwhoo commented Mar 12, 2025

@ishaan-jaff Hey, thanks for the offer, really appreciate it! But at the moment I’m unable to take on any new commitments. Really great product though. Best of luck on your startup!

Successfully merging this pull request may close these issues.

[Bug]: Regression in triton completions streaming
3 participants