add nvidia nim rerank support #13178

mattf · 2024-04-30T14:19:36Z

Description

this adds the llama-index-postprocessor-nvidia-rerank package for interacting with ranking models hosted on ai.nvidia.com.

from llama_index.postprocessor.nvidia_rerank import NVIDIARerank
from llama_index.core.schema import NodeWithScore, Document

texts = [
    "two roads diverged in a yellow wood, and sorry i could not travel both and be one traveler, long i stood and looked down one as far as i could to where it bent in the undergrowth;",
    "then took the other, as just as fair, and having perhaps the better claim because it was grassy and wanted wear, though as for that the passing there had worn them really about the same,",
    "and both that morning equally lay in leaves no step had trodden black. oh, i marked the first for another day! yet knowing how way leads on to way i doubted if i should ever come back.",
    "i shall be telling this with a sigh somewhere ages and ages hence: two roads diverged in a wood, and i, i took the one less traveled by, and that has made all the difference."
]

nodes = [NodeWithScore(node=Document(text=text)) for text in texts]

rerank = NVIDIARerank()
rerank.postprocess_nodes(nodes=nodes, query_str="which way should i go?")

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

Yes
No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

Yes
No
N/A - new package

Type of Change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Added new unit/integration tests
Added new notebook (that tests end-to-end)
I stared at the code and made sure it makes sense

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran make format; make lint to appease the lint gods

Note

Co-authored with Zenodia Charpy

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>


review-notebook-app · 2024-04-30T14:19:41Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Untitled.ipynb

Llama-index_post_processing_reranker.ipynb

logan-markewich · 2024-05-02T03:51:07Z

...ssor/llama-index-postprocessor-nvidia-rerank/llama_index/postprocessor/nvidia_rerank/base.py

+        ids = [DEFAULT_MODEL]
+        return [Model(id=id) for id in ids]
+
+    def mode(


yea again, this feels like a function that isn't needed? It could be just part of the init right?

mattf · 2024-05-02

i agree, this could be part of init. a user is unlikely to create an instance and then switch its mode more than once. this is the design flow we're using consistently across communities. it's also up for review.
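A minimal sketch of the two designs discussed in this thread, for comparison only. It is not the NVIDIARerank implementation: the class names, the `base_url` parameter, the `resolve_base_url` helper, and the placeholder URL are all hypothetical. Only the "nim" and "catalog" mode names and the existence of a `mode` method come from the PR itself.

```python
from typing import Optional

# Placeholder value for illustration; not a real NVIDIA endpoint.
CATALOG_URL = "https://catalog.example/v1"
VALID_MODES = ("catalog", "nim")


def resolve_base_url(mode: str, base_url: Optional[str]) -> str:
    """Hypothetical helper: pick the endpoint for a given mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "nim":
        # A self-hosted deployment has no default address in this sketch.
        if base_url is None:
            raise ValueError("base_url is required when mode is 'nim'")
        return base_url
    return CATALOG_URL


class ModeMethodClient:
    """Design A: construct first, then switch with a separate mode() call."""

    def __init__(self) -> None:
        self.mode_name = "catalog"
        self.base_url = CATALOG_URL

    def mode(self, mode: str, *, base_url: Optional[str] = None) -> "ModeMethodClient":
        self.base_url = resolve_base_url(mode, base_url)
        self.mode_name = mode
        return self  # returning self allows chaining after construction


class InitConfiguredClient:
    """Design B: the mode is fixed once, as part of __init__."""

    def __init__(self, mode: str = "catalog", base_url: Optional[str] = None) -> None:
        self.base_url = resolve_base_url(mode, base_url)
        self.mode_name = mode


# Both designs end up in the same state; B needs one call instead of two.
a = ModeMethodClient().mode("nim", base_url="http://localhost:8000/v1")
b = InitConfiguredClient(mode="nim", base_url="http://localhost:8000/v1")
```

Design B removes the window in which an instance exists in one mode and is later mutated into another, which is the simplification suggested in the review comment.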

Zenodia and others added 26 commits April 30, 2024 10:03

draft implementation of nvidia API catalog reranker on llama-index

0904c03

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

merged LLOC-29 and LLOc-77, able to switch mode between nim and catalog

830e646

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

re-worked for the .mode to support nim and catalog

d7393c0

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

working version except for availablemodels

709e192

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

the get_available_models() is currently only hard-coded as python dict

0f700a2

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

WIP resolving Matthew' opened review threads

3995e0f

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

WIP code to resolve review threads

1c682c7

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

update on resolving threads of raise_status_code and remove sort the …

9028002

…list Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

implement batching with default 32, verified in the notebook that it …

b4e312c

…works Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

updating the notebook with batching implemented and fetch toy data to…

78b9212

… verify batching works Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

remove not used import of _static.py

04fde8b

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

make sure you can run all cells in one go

48d4974

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

update enire llama index folder

d609baa

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

update entire folder

9ccdff9

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

resolving user should not access url

80cba8b

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

making batch processing a private function

399d704

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

modify README for user

e12ac42

Signed-off-by: Zenodia Charpy <zcharpy@nvidia.com>

pass pre-commit formatting and linting

25cf586

add alignment tests

593f111

add basic functionality test

17bb70c

add base implementation of NVIDIARerank

17089df

add available_models support

01a2f11

add NVIDIARerank example notebook

6c7a463

keep base urls to .../v1 (future should remove /v1)

5a293a7

add note about truncation workaround in test

89d3fbd

add basic tool.llamahub config

9dd4079

dosubot bot added the size:XXL label (this PR changes 1000+ lines, ignoring generated files) Apr 30, 2024

mattf marked this pull request as draft April 30, 2024 14:19

mattf marked this pull request as ready for review April 30, 2024 20:53

build

05e5b17

logan-markewich reviewed May 2, 2024

Untitled.ipynb (outdated)

logan-markewich reviewed May 2, 2024

Llama-index_post_processing_reranker.ipynb (outdated)

logan-markewich reviewed May 2, 2024

remove unnecessary notebooks

58ce38d

dosubot bot added the size:XL label (this PR changes 500-999 lines, ignoring generated files) and removed the size:XXL label (this PR changes 1000+ lines, ignoring generated files) May 2, 2024

logan-markewich approved these changes May 3, 2024

dosubot bot added the lgtm label (this PR has been approved by a maintainer) May 3, 2024

logan-markewich merged commit f79e186 into run-llama:main May 3, 2024
8 checks passed
