Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broadcast operation failure while using simsimd beyond v3.7.7 #22271

Merged
merged 12 commits into from
Jun 4, 2024
Merged

Broadcast operation failure while using simsimd beyond v3.7.7 #22271

merged 12 commits into from
Jun 4, 2024

Conversation

joydeepbroy-zeotap
Copy link
Contributor

@joydeepbroy-zeotap joydeepbroy-zeotap commented May 29, 2024

  • Packages affected:

    • community: fix cosine_similarity to support simsimd beyond 3.7.7
    • partners/milvus: fix cosine_similarity to support simsimd beyond 3.7.7
    • partners/mongodb: fix cosine_similarity to support simsimd beyond 3.7.7
    • partners/pinecone: fix cosine_similarity to support simsimd beyond 3.7.7
    • partners/qdrant: fix cosine_similarity to support simsimd beyond 3.7.7
  • Broadcast operation failure while using simsimd beyond v3.7.7:

Screenshot 2024-05-29 at 2 50 00 PM
  • Considerations:
    1. I started with community but since similar changes were there in Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well. If touching multiple packages in one PR is not the norm, then I can remove them from this PR and raise separate ones
    2. I have run and verified that the tests work. Since, only MongoDB had tests, I ran theirs and verified it works as well. Screenshots attached :
Screenshot 2024-05-29 at 2 52 13 PM Screenshot 2024-05-29 at 3 33 51 PM

I have added a test for simsimd. I feel it may not go well with the CI/CD setup as installing simsimd is not a dependency requirement. I have just imported simsimd to ensure simsimd cosine similarity is invoked. However, its not a good approach. Suggestions are welcome and I can make the required changes on the PR. Please provide guidance on the same as I am new to the community.

Sorry, something went wrong.

@efriis efriis added the partner label May 29, 2024
Copy link

vercel bot commented May 29, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 4, 2024 5:36pm

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label May 29, 2024
@efriis efriis self-assigned this May 29, 2024
@dosubot dosubot bot added 🔌: milvus Primarily related to Milvus vector store integration 🔌: mongo Primarily related to Mongo integrations 🔌: pinecone Primarily related to Pinecone vector store integration 🔌: qdrant Primarily related to Qdrant vector store integration 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels May 29, 2024
@joydeepbroy-zeotap
Copy link
Contributor Author

@efriis @baskaryan please review

Copy link
Collaborator

@baskaryan baskaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashvardanian
Copy link
Contributor

Patches make sense to me 🤗

baskaryan added 3 commits June 3, 2024 15:50
fmt
fmt
fmt
@joydeepbroy-zeotap
Copy link
Contributor Author

joydeepbroy-zeotap commented Jun 4, 2024

Thanks @baskaryan for the unit test fix. What are the next steps on this?

baskaryan and others added 3 commits June 4, 2024 10:24
@baskaryan baskaryan enabled auto-merge (squash) June 4, 2024 17:25
@baskaryan baskaryan merged commit 3796672 into langchain-ai:master Jun 4, 2024
97 checks passed
hinthornw pushed a commit that referenced this pull request Jun 20, 2024
…re while using simsimd beyond v3.7.7 (#22271)

- [ ] **Packages affected**: 
  - community: fix `cosine_similarity` to support simsimd beyond 3.7.7
- partners/milvus: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/mongodb: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/pinecone: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/qdrant: fix `cosine_similarity` to support simsimd beyond
3.7.7


- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
- **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (have attached a screenshot for that). Looks like it
is a variant of #18022 .
Prior to 3.7.7, simd.cdist returned an ndarray but now it returns
simsimd.DistancesTensor which is ineligible for a broadcast operation
with numpy. With this change, it also remove the need to explicitly cast
`Z` to numpy array
    - **Issue:** #19905
    - **Dependencies:** No
    - **Twitter handle:** https://x.com/GetzJoydeep

<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">

- [ ] **Considerations**: 
1. I started with community but since similar changes were there in
Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well.
If touching multiple packages in one PR is not the norm, then I can
remove them from this PR and raise separate ones
2. I have run and verified that the tests work. Since, only MongoDB had
tests, I ran theirs and verified it works as well. Screenshots attached
:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
  

I have added a test for simsimd. I feel it may not go well with the
CI/CD setup as installing simsimd is not a dependency requirement. I
have just imported simsimd to ensure simsimd cosine similarity is
invoked. However, its not a good approach. Suggestions are welcome and I
can make the required changes on the PR. Please provide guidance on the
same as I am new to the community.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: milvus Primarily related to Milvus vector store integration 🔌: mongo Primarily related to Mongo integrations partner 🔌: pinecone Primarily related to Pinecone vector store integration 🔌: qdrant Primarily related to Qdrant vector store integration size:M This PR changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants