community: Add Dria retriever #17098
Merged
Changes from 4 commits
Commits (13):
dedf13f Dria integration added (anilaltuner)
da9de64 Versioning update (anilaltuner)
2d4669f Versioning update (anilaltuner)
97f6334 lint fixes, version issue cause of javelin-sdk (anilaltuner)
97f2f3c Dria - import lib fixes (anilaltuner)
fa4c7eb Merge branch 'master' into master (anilaltuner)
f0c11bd Remove Imports libs/langchain (anilaltuner)
5528500 Merge remote-tracking branch 'origin/master' (anilaltuner)
7573925 Merge branch 'master' into master (anilaltuner)
e8dfdb6 Merge branch 'master' into master (anilaltuner)
c1c6516 Merge branch 'master' into master (anilaltuner)
d00ce88 Remove empty line of imports (anilaltuner)
8b23a9e Remove empty line of imports (anilaltuner)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,191 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "UYyFIEKEkmHb" | ||
}, | ||
"source": [ | ||
"# Dria\n", | ||
"\n", | ||
"Dria is a hub of public RAG models where developers can both contribute to and use a shared embedding lake. This notebook demonstrates how to use the Dria API for data retrieval tasks." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "VNTFUgK9kmHd" | ||
}, | ||
"source": [ | ||
"# Installation\n", | ||
"\n", | ||
"Ensure you have the `dria` package installed. You can install it using pip:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "X--1A8EEkmHd" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"%pip install --upgrade --quiet dria" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "xRbRL0SgkmHe" | ||
}, | ||
"source": [ | ||
"# Configure API Key\n", | ||
"\n", | ||
"Set up your Dria API key for access." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": { | ||
"id": "hGqOByNMkmHe" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import os\n", | ||
"\n", | ||
"os.environ[\"DRIA_API_KEY\"] = \"DRIA_API_KEY\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "nDfAEqQtkmHe" | ||
}, | ||
"source": [ | ||
"# Initialize Dria Retriever\n", | ||
"\n", | ||
"Create an instance of `DriaRetriever`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "vlyorgCckmHe" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_community.retrievers.dria_index import DriaRetriever\n", | ||
"\n", | ||
"api_key = os.getenv(\"DRIA_API_KEY\")\n", | ||
"retriever = DriaRetriever(api_key=api_key)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "j7WUY5jBOLQd" | ||
}, | ||
"source": [ | ||
"# Create Knowledge Base\n", | ||
"\n", | ||
"Create a knowledge base on [Dria's Knowledge Hub](https://dria.co/knowledge)." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "L5ER81eWOKnt" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"contract_id = retriever.create_knowledge_base(\n", | ||
" name=\"France's AI Development\",\n", | ||
" embedding=DriaRetriever.models.jina_embeddings_v2_base_en.value,\n", | ||
" category=\"Artificial Intelligence\",\n", | ||
" description=\"Explore the growth and contributions of France in the field of Artificial Intelligence.\",\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "9VCTzSFpkmHe" | ||
}, | ||
"source": [ | ||
"# Add Data\n", | ||
"\n", | ||
"Load data into your Dria knowledge base." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "xeTMafIekmHf" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"texts = [\n", | ||
"    {\"text\": \"The first text to add to Dria.\", \"metadata\": {}},\n", | ||
"    {\"text\": \"Another piece of information to store.\", \"metadata\": {}},\n", | ||
"    {\"text\": \"More data to include in the Dria knowledge base.\", \"metadata\": {}},\n", | ||
"]\n", | ||
"\n", | ||
"retriever.add_texts(texts)\n", | ||
"print(\"Data added.\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "dy1UlvLCkmHf" | ||
}, | ||
"source": [ | ||
"# Retrieve Data\n", | ||
"\n", | ||
"Use the retriever to find relevant documents given a query." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "9y3msv9tkmHf" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"query = \"Find information about Dria.\"\n", | ||
"result = retriever.get_relevant_documents(query)\n", | ||
"for doc in result:\n", | ||
" print(doc)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"colab": { | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.x" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
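The `add_texts` method in this PR indexes each item with `text["text"]` and `text["metadata"]`, so plain strings have to be wrapped in dicts before they are passed in. A minimal sketch of such a wrapper follows. The helper name `to_dria_payload` is hypothetical and not part of the PR, and defaulting to an empty metadata dict is an assumption about what the Dria API accepts.

```python
from typing import Any, Dict, List, Optional


def to_dria_payload(
    texts: List[str],
    metadatas: Optional[List[Dict[str, Any]]] = None,
) -> List[Dict[str, Any]]:
    """Pair each text with its metadata in the shape add_texts indexes into.

    Hypothetical helper: the empty-dict default is an assumption.
    """
    if metadatas is None:
        metadatas = [{} for _ in texts]
    if len(metadatas) != len(texts):
        raise ValueError("texts and metadatas must have the same length")
    return [{"text": t, "metadata": m} for t, m in zip(texts, metadatas)]


payload = to_dria_payload(["The first text to add to Dria."])
print(payload)
```

The resulting list could then be passed to `retriever.add_texts(payload)`.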
87 changes: 87 additions & 0 deletions
libs/community/langchain_community/retrievers/dria_index.py
@@ -0,0 +1,87 @@
"""Wrapper around Dria Retriever."""

from typing import List, Optional

from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

from langchain_community.utilities import DriaAPIWrapper


class DriaRetriever(BaseRetriever):
    """`Dria` retriever using the DriaAPIWrapper."""

    api_wrapper: DriaAPIWrapper

    def __init__(self, api_key: str, contract_id: Optional[str] = None):
        """
        Initialize the DriaRetriever with a DriaAPIWrapper instance.

        Args:
            api_key: The API key for Dria.
            contract_id: The contract ID of the knowledge base to interact with.
        """
        api_wrapper = DriaAPIWrapper(api_key=api_key, contract_id=contract_id)
        super().__init__(api_wrapper=api_wrapper)  # pydantic field set at init

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        category: str = "Unspecified",
        embedding: str = "jina",
    ) -> str:
        """Create a new knowledge base in Dria.

        Args:
            name: The name of the knowledge base.
            description: The description of the knowledge base.
            category: The category of the knowledge base.
            embedding: The embedding model to use for the knowledge base.


        Returns:
            The ID of the created knowledge base.
        """
        response = self.api_wrapper.create_knowledge_base(
            name, description, category, embedding
        )
        return response

    def add_texts(
        self,
        texts: List,
    ) -> None:
        """Add texts to the Dria knowledge base.

        Args:
            texts: A list of dicts, each with "text" and "metadata" keys.

        Returns:
            None.
        """
        data = [{"text": text["text"], "metadata": text["metadata"]} for text in texts]
        self.api_wrapper.insert_data(data)

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        """Retrieve relevant documents from Dria based on a query.

        Args:
            query: The query string to search for in the knowledge base.
            run_manager: Callback manager for the retriever run.

        Returns:
            A list of Documents containing the search results.
        """
        results = self.api_wrapper.search(query)
        docs = [
            Document(
                page_content=result["metadata"],
                metadata={"id": result["id"], "score": result["score"]},
            )
            for result in results
        ]
        return docs
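To make the mapping in `_get_relevant_documents` easier to follow, here is a standalone sketch of the same transformation using only the standard library. `FakeDocument` is a hypothetical stand-in for `langchain_core.documents.Document`, and the shape of the sample search result (a string under `"metadata"`, plus `"id"` and `"score"`) is an assumption inferred from the keys the PR code reads, not confirmed output from the Dria API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeDocument:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def results_to_documents(results: List[Dict[str, Any]]) -> List[FakeDocument]:
    """Mirror the PR's mapping: the stored text comes back under "metadata"."""
    return [
        FakeDocument(
            page_content=result["metadata"],
            metadata={"id": result["id"], "score": result["score"]},
        )
        for result in results
    ]


# Assumed result shape, for illustration only.
sample = [{"id": 1, "score": 0.92, "metadata": "Paris hosts several AI labs."}]
docs = results_to_documents(sample)
print(docs[0].page_content)
```

Note that the search hit's `"metadata"` value becomes `page_content`, while `id` and `score` move into the document's own metadata.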
should flip this order