Comparing changes
base repository: caikit/caikit-nlp
base: v0.5.6
head repository: caikit/caikit-nlp
compare: v0.5.7
- 8 commits
- 3 files changed
- 2 contributors
Commits on Sep 11, 2024
CrossEncoderModule with rerank API
This module is closely related to EmbeddingModule. Cross-encoder models take Q and A pairs and are trained to return a relevance score for rank(). The existing rerank APIs in EmbeddingModule had to encode Q and A separately and use cosine similarity as the score. The API is therefore the same, but the results are expected to be better (and slower to compute). Cross-encoder models do not support returning embedding vectors or sentence similarity. Support for the existing tokenization and model_info endpoints was also added.
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
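The difference between the two rerank approaches described in this commit message can be sketched roughly as follows. This is a toy illustration only: `embed`, `toy_pair_score`, and both rerank helpers are hypothetical stand-ins, not functions from caikit-nlp, and real models would replace the word-count logic.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a bi-encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def bi_encoder_rerank(query, documents):
    """Encode query and documents separately, then score by cosine similarity."""
    q_vec = embed(query)
    scored = [(i, cosine(q_vec, embed(d))) for i, d in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_pair_score(query, document):
    """Toy stand-in for a cross-encoder: it sees both texts at once.

    Here the score is just the fraction of query tokens found in the document.
    """
    q_tokens = query.lower().split()
    d_tokens = set(document.lower().split())
    if not q_tokens:
        return 0.0
    return sum(t in d_tokens for t in q_tokens) / len(q_tokens)


def cross_encoder_rerank(query, documents, score_pair=toy_pair_score):
    """Score each (query, document) pair jointly; no embeddings are produced."""
    scored = [(i, score_pair(query, d)) for i, d in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Both helpers return `(index, score)` pairs sorted by score, which mirrors the point that the API shape stays the same while the scoring method changes.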
Commit 5b0989f
Commits on Sep 12, 2024
Cross-encoder improvements from code review
* Mostly removing unnecessary code
* Some better clarity
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Commit 7146ffe
* The "Already borrowed" errors are fixed by using a tokenizer per thread, so some comments about not changing truncation params were misleading (we do change them for cross-encoder truncation).
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Commit ac46993
Cross-Encoder use configurable batch size.
The default is 32. It can be overridden with the embedding batch_size setting in config or with the EMBEDDING_BATCH_SIZE env var.
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
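One way such a lookup could work is sketched below. The helper name `resolve_batch_size`, the nested `{"embedding": {"batch_size": ...}}` config layout, the precedence of the env var over config, and the fallback for non-positive values are all assumptions made for illustration; only the default of 32 and the `EMBEDDING_BATCH_SIZE` name come from the commit message.

```python
import os

DEFAULT_BATCH_SIZE = 32  # default stated in the commit message


def resolve_batch_size(config=None, environ=None):
    """Return the batch size from env var, then config, then the default.

    Assumption: the env var wins over config. The real module may resolve
    these through its own configuration layer instead.
    """
    environ = os.environ if environ is None else environ
    config = config or {}

    raw = environ.get("EMBEDDING_BATCH_SIZE")
    if raw is None:
        raw = config.get("embedding", {}).get("batch_size")
    if raw is None:
        return DEFAULT_BATCH_SIZE

    value = int(raw)
    # Assumption: fall back to the default for zero or negative values.
    return value if value > 0 else DEFAULT_BATCH_SIZE
```

Passing `environ` explicitly keeps the helper easy to test without touching the real process environment.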
Commit 4e9c5aa
Cross-encoder: Move truncation check and add tests
* Moved the truncation check to a place that can determine the proper index for the error message (with batching).
* Added a test to validate some results after truncation. It uses a tiny model, but works as a sanity check.
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
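The index problem with batching can be illustrated with a small sketch: inside a batch, only the position within that batch is known, so the batch start offset has to be added back to report the position in the original input list. The names `batches` and `check_truncation` and the error wording are hypothetical, not taken from the module.

```python
def batches(items, batch_size):
    """Yield (start_offset, slice) pairs covering items in order."""
    for start in range(0, len(items), batch_size):
        yield start, items[start:start + batch_size]


def check_truncation(token_counts, max_tokens, batch_size):
    """Raise if any input is too long, naming its index in the full input list.

    token_counts holds the token count of each input text, in input order.
    """
    for start, batch in batches(token_counts, batch_size):
        for offset, count in enumerate(batch):
            if count > max_tokens:
                # offset alone is only the position inside this batch;
                # start + offset is the position in the original inputs.
                index = start + offset
                raise ValueError(
                    f"Token count {count} exceeds the maximum {max_tokens} "
                    f"for text at index {index}"
                )
```

For example, with a batch size of 2, the fourth input sits at offset 1 of the second batch but should still be reported as index 3.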
Commit 211668a
Cross-encoder: fix truncation test
The part that really tests that a token is truncated was wrong.
* It was backwards, and was passing because the scores are sorted by rank.
* The test now uses the index to get scores in the order of the inputs.
* Now correctly xx != xy but xy == xyz (truncated z).
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
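The pitfall described here can be reproduced with a toy ranker. In this sketch `toy_rank` and `scores_in_input_order` are hypothetical helpers, and the score is an arbitrary function of the first `max_tokens` tokens, standing in for a real model. Results come back sorted by score, so positions in the result list do not match input positions; the `index` field has to be used to map scores back.

```python
def toy_rank(query, documents, max_tokens):
    """Score each document on its first max_tokens tokens, sorted best first."""
    results = []
    for index, document in enumerate(documents):
        tokens = document.split()[:max_tokens]  # truncation happens here
        score = sum(ord(t[0]) for t in tokens) + len(query.split())
        results.append({"index": index, "score": score})
    return sorted(results, key=lambda r: r["score"], reverse=True)


def scores_in_input_order(results):
    """Undo the rank ordering by placing each score at its input index."""
    ordered = [None] * len(results)
    for result in results:
        ordered[result["index"]] = result["score"]
    return ordered


documents = ["x x", "x y", "x y z"]
results = toy_rank("q", documents, max_tokens=2)
xx, xy, xyz = scores_in_input_order(results)

assert xx != xy   # a different second token changes the score
assert xy == xyz  # the third token is truncated, so the scores match
```

Comparing `results[0]` and `results[1]` directly would compare whichever documents happened to rank first and second, which is how a backwards check can pass by accident.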
Commit 2cb6183
Cross-encoder: remove some unused and tidy up some comments
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Commit 8fa67cc
Merge pull request #389 from markstur/crossencoder
CrossEncoderModule with rerank API
Commit 1695c3b