feat: new text embedding for sparse vector #466
base: main
The failed CI is due to an upstream incompatibility:
2089054 to 0e0c58b
The reason why CI fails is that
src/datatype/text_svecf32.rs
if *x != F32::zero() {
    match need_splitter {
        true => {
            buffer.push_str(format!("{}:{}", i + 1, x).as_str());
I don't feel good about indexing from 1. It's not consistent with subscripting.
69e527e to 2823e00
Let's hold this PR for now, due to the conflict between 1-based and 0-based array indexing.
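To make the indexing conflict concrete, here is a minimal sketch of a formatter that keeps 0-based subscripts internally and shifts to 1-based indices only in the text form. It is not the PR's implementation: the function name `format_sparse_one_based` is hypothetical, plain `f32` stands in for the crate's `F32` type, and the `{index:value,...}/dims` layout is assumed to follow pgvector's sparse vector text format.

```rust
/// Hypothetical helper: formats a dense slice as sparse text with
/// 1-based indices, for example `{1:1,3:2,5:3}/5`.
fn format_sparse_one_based(dense: &[f32]) -> String {
    let mut buffer = String::from("{");
    let mut need_splitter = false;
    for (i, x) in dense.iter().enumerate() {
        // Only non-zero entries are written out.
        if *x != 0.0 {
            if need_splitter {
                buffer.push(',');
            }
            // `i` is the 0-based subscript; the text form shifts it by one.
            buffer.push_str(&format!("{}:{}", i + 1, x));
            need_splitter = true;
        }
    }
    // Close the entry list and append the total number of dimensions.
    buffer.push_str(&format!("}}/{}", dense.len()));
    buffer
}
```

With this layout the off-by-one shift lives in exactly one place, so the rest of the code can keep ordinary 0-based subscripting.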
Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
Signed-off-by: usamoi <usamoi@outlook.com>
This is used to support the bm25 extension. It can produce a string instead of depending on pgvecto.rs/pgvector. cc @cutecutecat
    return Err(ParseVectorError::BadParentheses { character: '{' });
};
let mut token: ArrayVec<u8, 48> = ArrayVec::new();
let mut capacity = reserve;
It reserves too much capacity, since the vector is sparse.
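One way to address this, sketched below under stated assumptions, is to size the buffers by the number of `index:value` pairs in the input rather than by the dimension count. The function `parse_sparse_one_based` and its `String` error type are hypothetical and are not the PR's parser; the input is assumed to use the `{index:value,...}/dims` text layout with 1-based indices, which the sketch converts to 0-based subscripts.

```rust
/// Hypothetical parser: reads text such as `{1:1,3:2}/5` and returns
/// 0-based indices, values, and the dimension count.
fn parse_sparse_one_based(input: &str) -> Result<(Vec<usize>, Vec<f32>, usize), String> {
    let (body, dims) = input
        .trim()
        .rsplit_once('/')
        .ok_or_else(|| "missing `/dims` suffix".to_string())?;
    let dims: usize = dims.trim().parse().map_err(|e| format!("bad dims: {e}"))?;
    let body = body
        .trim()
        .strip_prefix('{')
        .and_then(|s| s.strip_suffix('}'))
        .ok_or_else(|| "missing braces".to_string())?;

    // Reserve by the number of `index:value` pairs, not by `dims`,
    // because a sparse vector usually has far fewer entries than dimensions.
    let pairs = body.bytes().filter(|b| *b == b':').count();
    let mut indexes = Vec::with_capacity(pairs);
    let mut values = Vec::with_capacity(pairs);

    for entry in body.split(',').filter(|s| !s.trim().is_empty()) {
        let (i, x) = entry
            .split_once(':')
            .ok_or_else(|| format!("bad entry: {entry}"))?;
        let i: usize = i.trim().parse().map_err(|e| format!("bad index: {e}"))?;
        // Text indices are 1-based, so 0 is never valid.
        if i == 0 || i > dims {
            return Err(format!("index {i} is outside 1..={dims}"));
        }
        let x: f32 = x.trim().parse().map_err(|e| format!("bad value: {e}"))?;
        indexes.push(i - 1); // convert to a 0-based subscript
        values.push(x);
    }
    Ok((indexes, values, dims))
}
```

Counting separators costs one extra pass over the input, but it keeps the allocation proportional to the non-zero entries even when the vector has a very large number of dimensions.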
Part of #459
proc_macro_byte_character from upstream
Reminder
The index starts from 1 instead of 0 in pgvector.