Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix chunking strategy serialization and field visibility #288

Merged
merged 3 commits into from
Nov 18, 2024

Conversation

nikessel
Copy link
Contributor

Fix chunking strategy serialization and field visibility

Description

This PR fixes three issues with chunking strategies in the vector stores API:

  1. Incorrect serialization format for static chunking strategy
  2. Incorrect casing in enum variant serialization ("Static" vs "static")
  3. Private fields preventing users from setting chunking parameters

Changes

Serialization Fix

The OpenAI API expects the static chunking configuration to be nested under a "static" field, but the current implementation serializes it directly. This causes the error:

"Missing required parameter: 'chunking_strategy.static'."

Additionally, the enum variants were serializing with incorrect casing, causing:

"Invalid value: 'Static'. Supported values are: 'auto' and 'static'."

Changed from:

#[derive(Clone, Serialize, Debug, Deserialize, PartialEq, Default)]
#[serde(tag = "type")]
pub enum VectorStoreChunkingStrategy {
    #[default]
    Auto,
    Static(StaticChunkingStrategy),
}

To:

#[derive(Clone, Serialize, Debug, Deserialize, PartialEq, Default)]
#[serde(tag = "type")]
pub enum VectorStoreChunkingStrategy {
    #[default]
    #[serde(rename = "auto")]
    Auto,
    #[serde(rename = "static")]
    Static {
        #[serde(rename = "static")]
        config: StaticChunkingStrategy,
    },
}

Field Visibility

Made fields in StaticChunkingStrategy public to allow users to configure chunking parameters:

#[derive(Clone, Serialize, Debug, Deserialize, PartialEq, Default)]
pub struct StaticChunkingStrategy {
    pub max_chunk_size_tokens: u16,
    pub chunk_overlap_tokens: u16,
}

Copy link
Owner

@64bit 64bit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for detailed description and PR!

@64bit 64bit merged commit 9274131 into 64bit:main Nov 18, 2024
ifsheldon pushed a commit to ifsheldon/async-openai-wasm that referenced this pull request Nov 23, 2024

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
* Fix AssistantVectorStoreChunkingStrategy rename and make StaticChunkingStrategy fields public

* Fix VectorStoreChunkingStrategy rename

* Fix chunking stategy static field name

---------

Co-authored-by: Niklas <71767810+niklass-l@users.noreply.github.com>
(cherry picked from commit 9274131)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants