Improved PGVector metadata filtering (no breaking changes) #12977
Any update on this PR? Our team is unable to use langchain with PGVector due to its lack of support for "OR" filters. Thanks.
@raghav-knowbe4 The PR is ready for review by the langchain developers. I'm not sure what the next steps are to get it reviewed. I can change the filter format to match the Pinecone one, but "in" had already been added to this module in its current form, so I followed that to keep things consistent within the module. It would be good to have a standard across all the different data modules in langchain, but that's complicated, as it would be a breaking change for some modules.
Closing in favor of: #18992
@bradfordben thank you for the contribution, and sorry it took so long to review the PR. I made some larger changes to get the filter application working well for postgres.
Added more complex filtering of metadata to PGVector
Example filters:
{"column":"value"}
will result in:
WHERE langchain_pg_embedding.collection_id = 'xxxxx'::uuid::UUID AND (langchain_pg_embedding.cmetadata ->> 'column') = 'value'
{"column": {"in": ["value1", "value2"]}}
will result in:
WHERE langchain_pg_embedding.collection_id = 'xxxxx'::uuid::UUID AND (langchain_pg_embedding.cmetadata ->> 'column') IN ('value1', 'value2')
{"and":[
  {"or":[
    {"column1": "value1"},
    {"column2": "value2"}
  ]},
  {"or":[
    {"column3": "value3"},
    {"column3": {"like": "value4%"}}
  ]}
]}
will result in:
WHERE langchain_pg_embedding.collection_id = 'xxxxx'::uuid::UUID
  AND ((langchain_pg_embedding.cmetadata ->> 'column1') = 'value1'
    OR (langchain_pg_embedding.cmetadata ->> 'column2') = 'value2')
  AND ((langchain_pg_embedding.cmetadata ->> 'column3') = 'value3'
    OR langchain_pg_embedding.cmetadata ->> 'column3' LIKE 'value4%')
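
For readers who want to see how the nested filter format above maps onto a WHERE clause, here is a minimal standalone sketch of the recursion. It is not the code in this PR: the names `render`, `quote` and `COLUMN` are hypothetical, it builds plain strings rather than SQLAlchemy expressions, and it inlines values as literals purely for readability (real code should use bound parameters). It only covers the operators shown in the examples ("and", "or", "in", "like" and plain equality).

```python
from typing import Any

# Column name copied from the SQL shown above; everything else here is illustrative.
COLUMN = "langchain_pg_embedding.cmetadata"


def quote(value: Any) -> str:
    """Render a value as a SQL string literal (display only, not injection-safe practice)."""
    return "'" + str(value).replace("'", "''") + "'"


def render(filter_: dict) -> str:
    """Translate a nested filter dict into a SQL boolean expression string."""
    clauses = []
    for key, value in filter_.items():
        if key.lower() in ("and", "or"):
            # Logical operator: the value is a list of sub-filters, rendered recursively.
            joined = f" {key.upper()} ".join(render(sub) for sub in value)
            clauses.append(f"({joined})")
        elif isinstance(value, dict):
            # Comparison operator, e.g. {"in": [...]} or {"like": "..."}.
            field = f"({COLUMN} ->> {quote(key)})"
            for op, operand in value.items():
                if op == "in":
                    items = ", ".join(quote(v) for v in operand)
                    clauses.append(f"{field} IN ({items})")
                elif op == "like":
                    clauses.append(f"{field} LIKE {quote(operand)}")
                else:
                    raise ValueError(f"unsupported operator: {op}")
        else:
            # Bare value: plain equality on the JSON text value.
            clauses.append(f"({COLUMN} ->> {quote(key)}) = {quote(value)}")
    # Multiple top-level keys are combined with AND in this sketch.
    return " AND ".join(clauses)


if __name__ == "__main__":
    nested = {"and": [
        {"or": [{"column1": "value1"}, {"column2": "value2"}]},
        {"or": [{"column3": "value3"}, {"column3": {"like": "value4%"}}]},
    ]}
    print("WHERE " + render(nested))
```

The sketch parenthesises every group it emits, so its output for the nested example is equivalent to, but not character-for-character identical with, the SQL shown above.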