
gemini context caching (openai format) support #5381

Merged

merged 5 commits into main on Aug 27, 2024

Conversation

@krrishdholakia (Contributor) commented Aug 27, 2024

Title

gemini context caching (openai format)

from litellm import completion

completion(
    model="gemini/gemini-1.5-pro",
    messages=[
        # System message: the large document is marked for caching
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Here is the full text of a complex legal agreement" * 4000,
                    "cache_control": {"type": "ephemeral"},  # 👈 KEY CHANGE
                }
            ],
        },
        # Marked for caching with the cache_control parameter, so that this
        # checkpoint can read from the previous cache.
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the key terms and conditions in this agreement?",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
    ],
)

Relevant issues

Closes #4284
Closes #5213

Type

🆕 New Feature

Changes

  • adds a check for whether messages contain {"cache_control": {"type": "ephemeral"}} (see the sketch after this list)
  • checks whether the message(s) are already in the context cache; if not, adds them to the cache
  • uses the cached value in subsequent requests
  • also includes refactoring work for vertex ai / google ai studio, to make it easier to understand how transformations are applied (cc: @yujonglee)
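The helper below is a minimal sketch of that first check, splitting cache-marked messages from the rest. The function name and structure are illustrative assumptions, not litellm's actual internals:

def separate_cached_messages(messages):
    """Split messages into cache_control-marked messages and regular ones."""
    cached, regular = [], []
    for msg in messages:
        content = msg.get("content")
        is_marked = isinstance(content, list) and any(
            isinstance(block, dict)
            and block.get("cache_control", {}).get("type") == "ephemeral"
            for block in content
        )
        (cached if is_marked else regular).append(msg)
    return cached, regular

# The cached messages are then looked up in (or added to) the provider-side
# context cache; only the regular messages are sent with each request, along
# with a reference to the cached content.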

[REQUIRED] Testing - Attach a screenshot of any new tests passing locally

If UI changes, send a screenshot/GIF of working UI fixes

  • new test added to test_amazing_vertex_completion.py


@krrishdholakia krrishdholakia merged commit 81e62ae into main Aug 27, 2024
1 of 3 checks passed
@NF-Karlo commented Feb 20, 2025

Hi, this does not seem to work on the latest version (1.61.11). Running the exact toy example found in this PR, the following error pops up:

NotFoundError: litellm.NotFoundError: VertexAIException - { "error": { "code": 404, "message": "models/gemini-1.5-pro is not found for API version v1beta, or is not supported for createCachedContent. Call ListModels to see the list of available models and their supported methods.", "status": "NOT_FOUND" } }

Edit: per https://ai.google.dev/gemini-api/docs/caching?lang=python, they added the following:

Note: Context caching is only available for stable models with fixed versions (for example, gemini-1.5-pro-001). You must include the version postfix (for example, the -001 in gemini-1.5-pro-001).

The documentation in litellm should also be updated to account for this change.
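For anyone hitting the same 404, here is a sketch of the toy example adapted to pin a versioned model, per the documentation note above (exact model availability may vary):

from litellm import completion

completion(
    model="gemini/gemini-1.5-pro-001",  # 👈 stable, versioned model (note the -001 postfix)
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Here is the full text of a complex legal agreement" * 4000,
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {
            "role": "user",
            "content": "What are the key terms and conditions in this agreement?",
        },
    ],
)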

Development

Successfully merging this pull request may close these issues:

  • [Feature]: LiteLLM SDK - Add support for Google AI Studio context caching
  • Gemini API: Context Caching