nvidia-trt:add TritonTensorRTLLM(verbose_client=False) #16848

mkhludnev · 2024-01-31T21:52:21Z

Description: adds a verbose flag to TritonTensorRTLLM
Issue: none
Dependencies: none
Twitter handle:


mkhludnev · 2024-02-01T09:44:16Z

libs/partners/nvidia-trt/tests/unit_tests/test_llms.py

+    captured = StringIO()
+    sys.stdout = captured
+    with pytest.raises(InferenceServerException):
+        llm.client.is_server_live()


This is not perfect: the test still tries to connect to this address, which might cause cloud, CI, or security issues. I'm not sure. Open to other ideas.

Improved in the most recent push.
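One way to check the flag without touching the network is to swap in a stand-in client and capture stdout with contextlib.redirect_stdout, which also restores sys.stdout afterwards. The sketch below is only an illustration under stated assumptions: FakeInferenceClient and captured_output are hypothetical names, not part of this PR, and the printed line merely imitates what a verbose client might log.

```python
import io
from contextlib import redirect_stdout


class FakeInferenceClient:
    """Hypothetical stand-in for the Triton gRPC client; performs no network I/O."""

    def __init__(self, url: str, verbose: bool = False) -> None:
        self.url = url
        self.verbose = verbose

    def is_server_live(self) -> bool:
        # Assumption: a verbose client prints a line naming the call it makes.
        if self.verbose:
            print("is_server_live, metadata ()")
        return True


def captured_output(verbose: bool) -> str:
    """Run one client call and return whatever it printed to stdout."""
    client = FakeInferenceClient("localhost:8001", verbose=verbose)
    buffer = io.StringIO()
    # redirect_stdout puts sys.stdout back even if the call raises.
    with redirect_stdout(buffer):
        client.is_server_live()
    return buffer.getvalue()


quiet = captured_output(verbose=False)
noisy = captured_output(verbose=True)
```

Compared with assigning sys.stdout directly, the context manager avoids leaving stdout redirected for later tests when an assertion fails partway through.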

baskaryan

LLMs already have a verbose attribute, which is meant for configuring callbacks. Should we give this a different name? Maybe client_verbose?
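To illustrate the naming concern, here is a minimal sketch of keeping the two flags separate. SketchLLM and client_kwargs are hypothetical and are not the real TritonTensorRTLLM; the assumption is that verbose keeps its callback meaning while a second, separately named field is the only one forwarded to the underlying client.

```python
from dataclasses import dataclass


@dataclass
class SketchLLM:
    """Hypothetical wrapper used only to show the two flags side by side."""

    server_url: str
    verbose: bool = False         # existing flag: callback-level logging
    verbose_client: bool = False  # separate flag: forwarded to the client

    def client_kwargs(self) -> dict:
        # Only verbose_client reaches the client constructor arguments.
        return {"url": self.server_url, "verbose": self.verbose_client}


# Turning on callback verbosity leaves the client quiet.
callbacks_only = SketchLLM("localhost:8001", verbose=True).client_kwargs()
# Turning on client verbosity is an explicit, independent choice.
client_only = SketchLLM("localhost:8001", verbose_client=True).client_kwargs()
```

With distinct names, enabling one kind of logging cannot silently enable the other.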

make no network attempts in unit test

mkhludnev · 2024-03-03T07:47:02Z

@baskaryan let me know if I can improve it further.


efriis added the partner label Jan 31, 2024

efriis self-assigned this Jan 31, 2024

dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Jan 31, 2024

dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:improvement Medium size change to existing code to handle new use-cases labels Jan 31, 2024

mkhludnev mentioned this pull request Feb 1, 2024

Support TensorRT-LLM? #12474

Open

mkhludnev commented Feb 1, 2024


baskaryan reviewed Feb 13, 2024


mkhludnev requested a review from baskaryan February 13, 2024 20:35

mkhludnev changed the title ~~nvidia-trt:add TritonTensorRTLLM(verbose=False)~~ nvidia-trt:add TritonTensorRTLLM(verbose_client=False) Feb 13, 2024

mkhludnev added 4 commits March 3, 2024 10:42

add TritonTensorRTLLM(verbose=False)

20f7d5f

linter fix

7d2c7f7

rename verbose_client

c6a82a4

make no network attempts in unit test

ruff

9f05293

mkhludnev force-pushed the nvidia-trt-verbose branch from ee7d24d to 9f05293 Compare March 3, 2024 07:42

baskaryan approved these changes Mar 5, 2024


dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Mar 5, 2024

baskaryan merged commit d039dcb into langchain-ai:master Mar 5, 2024
20 checks passed
