
How can I get prompts_tokens? #10

Closed
magelikescoke opened this issue May 26, 2023 · 3 comments
@magelikescoke

magelikescoke commented May 26, 2023

What parameter should I pass to get the same prompt_tokens count that OpenAI returns?

{
  "messages": [
        {
            "role": "system",
            "content": ""
        },
        {
            "role": "user",
            "content": "你好"
        },
        {
            "role": "assistant",
            "content": "你好!有什么我可以帮助你的吗?"
        },
        {
            "role": "user",
            "content": "介绍下你自己"
        }
    ]
}

This is my prompt; how should I stringify it and pass it to encode?

@walternicholas

Same question as above.

I've tried running encode on each of the "content" values and summing the results, as well as running encode on JSON.stringify(entireMessagesArray).

The first method (summing the per-message counts) came out about 200 tokens under what OpenAI actually returned for "prompt_tokens", and the second overshot by about 200. For reference, this was on a request with 2994 prompt_tokens.

I'm using the "gpt-3.5-turbo" model and importing the default encode from "gpt-tokenizer" (which, according to the docs, should align with gpt-3.5-turbo).
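A gap in that direction is consistent with the per-message framing OpenAI adds around each chat message, which encoding the raw content alone doesn't account for. A rough sketch of that accounting, using the overhead constants the OpenAI cookbook documents for gpt-3.5-turbo-0301 (treat the exact numbers as assumptions; they vary by model version):

```javascript
// Sketch of OpenAI's chat token accounting. The constants below are
// taken from the OpenAI cookbook for gpt-3.5-turbo-0301 and are
// assumptions here; other model versions use slightly different values.
const TOKENS_PER_MESSAGE = 4; // framing tokens wrapped around each message
const REPLY_PRIMING = 3;      // tokens that prime the assistant's reply

// contentTokenCounts: the token count of each message's "content" string,
// e.g. obtained by running encode() on each content value separately.
function estimatePromptTokens(contentTokenCounts) {
  const content = contentTokenCounts.reduce((sum, n) => sum + n, 0);
  return content + TOKENS_PER_MESSAGE * contentTokenCounts.length + REPLY_PRIMING;
}

// Four messages whose contents encode to 1, 2, 10 and 5 tokens:
// 18 content + 4 * 4 framing + 3 priming
console.log(estimatePromptTokens([1, 2, 10, 5])); // → 37
```

This explains why summing the contents undershoots: each message costs a few framing tokens on top of its content.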

@niieani niieani closed this as completed in ff30f11 Jun 1, 2023
@niieani
Owner

niieani commented Jun 1, 2023

Should be fixed in the new version: there's a new API called encodeChat. See the updated README for details.
Let me know if you still have issues.

@github-actions

github-actions bot commented Jun 1, 2023

🎉 This issue has been resolved in version 2.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
