more query analysis docs #18358

Merged
merged 7 commits into from
Mar 2, 2024
190 changes: 190 additions & 0 deletions docs/docs/use_cases/query_analysis/how_to/constructing-filters.ipynb
@@ -0,0 +1,190 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 6\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Construct Filters\n",
"\n",
"Query analysis can extract filters to pass into a retriever. One way to have the LLM represent these filters is as a Pydantic model. That model must then be converted into a filter format the retriever accepts.\n",
"\n",
"This can be done manually, but LangChain also provides \"Translators\" that translate from a common syntax into the filter format specific to each retriever. Here, we cover how to use those translators."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "8ca446a0",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional\n",
"\n",
"from langchain.chains.query_constructor.ir import (\n",
" Comparator,\n",
" Comparison,\n",
" Operation,\n",
" Operator,\n",
" StructuredQuery,\n",
")\n",
"from langchain.retrievers.self_query.chroma import ChromaTranslator\n",
"from langchain.retrievers.self_query.elasticsearch import ElasticsearchTranslator\n",
"from langchain_core.pydantic_v1 import BaseModel"
]
},
{
"cell_type": "markdown",
"id": "bc1302ff",
"metadata": {},
"source": [
"In this example, `start_year` and `author` are both attributes to filter on."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "64055006",
"metadata": {},
"outputs": [],
"source": [
"class Search(BaseModel):\n",
"    query: str\n",
"    start_year: Optional[int] = None\n",
"    author: Optional[str] = None"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "44eb6d98",
"metadata": {},
"outputs": [],
"source": [
"search_query = Search(query=\"RAG\", start_year=2022, author=\"LangChain\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e8ba6705",
"metadata": {},
"outputs": [],
"source": [
"def construct_comparisons(query: Search):\n",
"    # Emit one Comparison per filter attribute that is set on the query.\n",
"    comparisons = []\n",
"    if query.start_year is not None:\n",
"        comparisons.append(\n",
"            Comparison(\n",
"                comparator=Comparator.GT,\n",
"                attribute=\"start_year\",\n",
"                value=query.start_year,\n",
"            )\n",
"        )\n",
"    if query.author is not None:\n",
"        comparisons.append(\n",
"            Comparison(\n",
"                comparator=Comparator.EQ,\n",
"                attribute=\"author\",\n",
"                value=query.author,\n",
"            )\n",
"        )\n",
"    return comparisons"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "6a79c9da",
"metadata": {},
"outputs": [],
"source": [
"comparisons = construct_comparisons(search_query)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "2d0e9689",
"metadata": {},
"outputs": [],
"source": [
"_filter = Operation(operator=Operator.AND, arguments=comparisons)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "e4c0b2ce",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'bool': {'must': [{'range': {'metadata.start_year': {'gt': 2022}}},\n",
" {'term': {'metadata.author.keyword': 'LangChain'}}]}}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ElasticsearchTranslator().visit_operation(_filter)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "d75455ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'$and': [{'start_year': {'$gt': 2022}}, {'author': {'$eq': 'LangChain'}}]}"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ChromaTranslator().visit_operation(_filter)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
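The notebook above notes that the conversion "can be done manually" before turning to the translators. As a rough illustration of what that manual route might look like, here is a minimal sketch that hand-builds the same Chroma-style filter shown in the notebook's last output. It makes several assumptions: a stdlib dataclass stands in for the Pydantic `Search` model so the snippet runs without LangChain installed, `to_chroma_filter` is a hypothetical helper name (not a LangChain or Chroma API), and a single clause is returned bare on the assumption that `$and` is only meaningful with two or more clauses.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Search:
    """Stand-in for the notebook's Pydantic `Search` model (assumption: stdlib only)."""

    query: str
    start_year: Optional[int] = None
    author: Optional[str] = None


def to_chroma_filter(search: Search) -> Optional[Dict[str, Any]]:
    """Hand-build a Chroma-style metadata filter (hypothetical helper)."""
    clauses: List[Dict[str, Any]] = []
    if search.start_year is not None:
        # Mirrors Comparator.GT on "start_year" in the notebook.
        clauses.append({"start_year": {"$gt": search.start_year}})
    if search.author is not None:
        # Mirrors Comparator.EQ on "author" in the notebook.
        clauses.append({"author": {"$eq": search.author}})
    if not clauses:
        return None  # nothing to filter on
    if len(clauses) == 1:
        return clauses[0]  # assumption: skip the $and wrapper for one clause
    return {"$and": clauses}


manual_filter = to_chroma_filter(
    Search(query="RAG", start_year=2022, author="LangChain")
)
print(manual_filter)
```

For the notebook's example query this yields the same dictionary as the `ChromaTranslator().visit_operation(_filter)` output. The drawback of the manual route is that an equivalent function would have to be written for every retriever backend, which is the duplication the translators are meant to avoid.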
@@ -15,9 +15,9 @@
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
-"# Adding examples to the prompt\n",
+"# Add Examples to the Prompt\n",
"\n",
-"As our query analysis becomes more complex, adding examples to the prompt can meaningfully improve performance.\n",
+"As our query analysis becomes more complex, the LLM may struggle to understand how exactly it should respond in certain scenarios. In order to improve performance here, we can add examples to the prompt to guide the LLM.\n",
"\n",
"Let's take a look at how we can add examples for the LangChain YouTube video query analyzer we built in the [Quickstart](/docs/use_cases/query_analysis/quickstart)."
]
@@ -377,7 +377,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.1"
+"version": "3.10.1"
}
},
"nbformat": 4,