Commit

feat: add bedrock anthropic for token usage counting
feat: update

fix: fix spell

docs: remove unnecessary args from the pip install (langchain-ai#19823)

**Description:** A stray `U` argument in the pip install instructions
for the MediaWiki Dump Document loader caused the package installation
to fail. Removing the argument fixes the install command.

**Issue:** langchain-ai#19820
**Dependencies:** No dependency change required
**Twitter handle:** [@vardhaman722](https://twitter.com/vardhaman722)

Update cross_encoder_reranker.ipynb (langchain-ai#19846)

HuggingFace -> Hugging Face

core: generate mermaid syntax and render visual graph (langchain-ai#19599)

- **Description:** Add functionality to generate Mermaid syntax and
render flowcharts from graph data. This includes support for custom node
colors and edge curve styles, as well as the ability to export the
generated graphs to PNG images using either the Mermaid.INK API or
Pyppeteer for local rendering.
- **Dependencies:** `pyppeteer` is an optional dependency, needed only
when rendering is done locally with Pyppeteer and JavaScript code.
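
As a rough illustration of what generating Mermaid syntax from graph data involves, the sketch below builds a flowchart definition from node labels and directed edges. The function name `to_mermaid` and its input shape are hypothetical; this is not the API added by the change described above.

```python
from typing import Dict, List, Tuple


def to_mermaid(nodes: Dict[str, str], edges: List[Tuple[str, str]]) -> str:
    """Build Mermaid flowchart text from node labels and directed edges.

    Hypothetical helper for illustration only.
    """
    lines = ["graph TD;"]
    # One line per node: id[label]
    for node_id, label in nodes.items():
        lines.append(f"    {node_id}[{label}];")
    # One line per directed edge: source --> target
    for source, target in edges:
        lines.append(f"    {source} --> {target};")
    return "\n".join(lines)


diagram = to_mermaid(
    nodes={"a": "Prompt", "b": "Model", "c": "Parser"},
    edges=[("a", "b"), ("b", "c")],
)
print(diagram)
```

The resulting text can be pasted into any Mermaid renderer; exporting it to PNG is a separate step that this sketch does not cover.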

---------

Co-authored-by: Angel Igareta <angel.igareta@klarna.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>

feat: fix

feat: update

fix: fix lint

fix: fix lint

fix: fix lint

ai21[patch]: release 0.1.3 (langchain-ai#19867)

👥 Update LangChain people data (langchain-ai#19858)

👥 Update LangChain people data

Co-authored-by: github-actions <github-actions@github.com>

community[patch]: Revert " Fix the bug that Chroma does not specify `embedding_function` (langchain-ai#19277)" (langchain-ai#19866)

This reverts commit 7042934.

Fixes langchain-ai#19848

fix: fix lint
Sukitly committed Apr 1, 2024
1 parent 003c98e commit ffb2ccb
Showing 13 changed files with 1,530 additions and 558 deletions.
1,422 changes: 886 additions & 536 deletions docs/data/people.yml

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions docs/docs/integrations/document_loaders/mediawikidump.ipynb
@@ -24,10 +24,10 @@
"outputs": [],
"source": [
"# mediawiki-utilities supports XML schema 0.11 in unmerged branches\n",
"%pip install --upgrade --quiet U git+https://github.com/mediawiki-utilities/python-mwtypes@updates_schema_0.11\n",
"%pip install --upgrade --quiet git+https://github.com/mediawiki-utilities/python-mwtypes@updates_schema_0.11\n",
"# mediawiki-utilities mwxml has a bug, fix PR pending\n",
"%pip install --upgrade --quiet U git+https://github.com/gdedrouas/python-mwxml@xml_format_0.11\n",
"%pip install --upgrade --quiet U mwparserfromhell"
"%pip install --upgrade --quiet git+https://github.com/gdedrouas/python-mwxml@xml_format_0.11\n",
"%pip install --upgrade --quiet mwparserfromhell"
]
},
{
@@ -7,11 +7,11 @@
"source": [
"# Cross Encoder Reranker\n",
"\n",
"This notebook shows how to implement reranker in a retriever with your own cross encoder from [HuggingFace cross encoder models](https://huggingface.co/cross-encoder) or HuggingFace models that implements cross encoder function ([example: BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)). `SagemakerEndpointCrossEncoder` enables you to use these HuggingFace models loaded on Sagemaker.\n",
"This notebook shows how to implement reranker in a retriever with your own cross encoder from [Hugging Face cross encoder models](https://huggingface.co/cross-encoder) or Hugging Face models that implements cross encoder function ([example: BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)). `SagemakerEndpointCrossEncoder` enables you to use these HuggingFace models loaded on Sagemaker.\n",
"\n",
"This builds on top of ideas in the [ContextualCompressionRetriever](/docs/modules/data_connection/retrievers/contextual_compression/). Overall structure of this document came from [Cohere Reranker documentation](/docs/integrations/retrievers/cohere-reranker.ipynb).\n",
"\n",
"For more about why cross encoder can be used as reranking mechanism in conjunction with embeddings for better retrieval, refer to [HuggingFace Cross-Encoders documentation](https://www.sbert.net/examples/applications/cross-encoder/README.html)."
"For more about why cross encoder can be used as reranking mechanism in conjunction with embeddings for better retrieval, refer to [Hugging Face Cross-Encoders documentation](https://www.sbert.net/examples/applications/cross-encoder/README.html)."
]
},
{
@@ -173,11 +173,11 @@
"id": "419a2bf3-de4b-4c4d-9a40-4336552f604c",
"metadata": {},
"source": [
"## Uploading HuggingFace model to SageMaker endpoint\n",
"## Uploading Hugging Face model to SageMaker endpoint\n",
"\n",
"Refer to [this article](https://www.philschmid.de/custom-inference-huggingface-sagemaker) for general guideline. Here is a simple `inference.py` for creating an endpoint that works with `SagemakerEndpointCrossEncoder`.\n",
"\n",
"It downloads HuggingFace model on the fly, so you do not need to keep the model artifacts such as `pytorch_model.bin` in your `model.tar.gz`."
"It downloads Hugging Face model on the fly, so you do not need to keep the model artifacts such as `pytorch_model.bin` in your `model.tar.gz`."
]
},
{
@@ -0,0 +1,108 @@
import threading
from typing import Any, Dict, List, Union

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult

MODEL_COST_PER_1K_INPUT_TOKENS = {
    "anthropic.claude-instant-v1": 0.0008,
    "anthropic.claude-v2": 0.008,
    "anthropic.claude-v2:1": 0.008,
    "anthropic.claude-3-sonnet-20240229-v1:0": 0.003,
    "anthropic.claude-3-haiku-20240307-v1:0": 0.00025,
}

MODEL_COST_PER_1K_OUTPUT_TOKENS = {
    "anthropic.claude-instant-v1": 0.0024,
    "anthropic.claude-v2": 0.024,
    "anthropic.claude-v2:1": 0.024,
    "anthropic.claude-3-sonnet-20240229-v1:0": 0.015,
    "anthropic.claude-3-haiku-20240307-v1:0": 0.00125,
}


def _get_anthropic_claude_token_cost(
    prompt_tokens: int, completion_tokens: int, model_id: Union[str, None]
) -> float:
    """Get the cost of tokens for the Claude model."""
    if not model_id:
        raise ValueError("Model name is required to calculate cost.")
    return (prompt_tokens / 1000) * MODEL_COST_PER_1K_INPUT_TOKENS[model_id] + (
        completion_tokens / 1000
    ) * MODEL_COST_PER_1K_OUTPUT_TOKENS[model_id]


class BedrockAnthropicTokenUsageCallbackHandler(BaseCallbackHandler):
    """Callback Handler that tracks bedrock anthropic info."""

    total_tokens: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0
    successful_requests: int = 0
    total_cost: float = 0.0

    def __init__(self) -> None:
        super().__init__()
        self._lock = threading.Lock()

    def __repr__(self) -> str:
        return (
            f"Tokens Used: {self.total_tokens}\n"
            f"\tPrompt Tokens: {self.prompt_tokens}\n"
            f"\tCompletion Tokens: {self.completion_tokens}\n"
            f"Successful Requests: {self.successful_requests}\n"
            f"Total Cost (USD): ${self.total_cost}"
        )

    @property
    def always_verbose(self) -> bool:
        """Whether to call verbose callbacks even if verbose is False."""
        return True

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> None:
        """Print out the prompts."""
        pass

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        """Print out the token."""
        pass

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
        """Collect token usage."""
        if response.llm_output is None:
            return None

        if "usage" not in response.llm_output:
            with self._lock:
                self.successful_requests += 1
            return None

        # compute tokens and cost for this request
        token_usage = response.llm_output["usage"]
        completion_tokens = token_usage.get("completion_tokens", 0)
        prompt_tokens = token_usage.get("prompt_tokens", 0)
        total_tokens = token_usage.get("total_tokens", 0)
        model_id = response.llm_output.get("model_id", None)
        total_cost = _get_anthropic_claude_token_cost(
            prompt_tokens=prompt_tokens,
            completion_tokens=completion_tokens,
            model_id=model_id,
        )

        # update shared state behind lock
        with self._lock:
            self.total_cost += total_cost
            self.total_tokens += total_tokens
            self.prompt_tokens += prompt_tokens
            self.completion_tokens += completion_tokens
            self.successful_requests += 1

    def __copy__(self) -> "BedrockAnthropicTokenUsageCallbackHandler":
        """Return a copy of the callback handler."""
        return self

    def __deepcopy__(self, memo: Any) -> "BedrockAnthropicTokenUsageCallbackHandler":
        """Return a deep copy of the callback handler."""
        return self
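
To make the pricing arithmetic concrete, here is a minimal standalone sketch of the same per-1K-token calculation. It copies two entries from the price tables above. The function name `claude_cost` and the explicit unknown-model error are illustrative assumptions and not part of the committed file, which would raise a bare `KeyError` for a model ID missing from its tables.

```python
# Prices in USD per 1,000 tokens, copied from the tables in the file above.
INPUT_COST_PER_1K = {
    "anthropic.claude-instant-v1": 0.0008,
    "anthropic.claude-v2": 0.008,
}
OUTPUT_COST_PER_1K = {
    "anthropic.claude-instant-v1": 0.0024,
    "anthropic.claude-v2": 0.024,
}


def claude_cost(prompt_tokens: int, completion_tokens: int, model_id: str) -> float:
    """Return the USD cost of one request (hypothetical helper)."""
    if model_id not in INPUT_COST_PER_1K or model_id not in OUTPUT_COST_PER_1K:
        raise ValueError(f"No pricing entry for model ID {model_id!r}")
    input_cost = (prompt_tokens / 1000) * INPUT_COST_PER_1K[model_id]
    output_cost = (completion_tokens / 1000) * OUTPUT_COST_PER_1K[model_id]
    return input_cost + output_cost


# 1,000 prompt tokens and 500 completion tokens on claude-v2:
# 1.0 * 0.008 + 0.5 * 0.024 = 0.008 + 0.012 = 0.02 USD
cost = claude_cost(1000, 500, "anthropic.claude-v2")
print(f"{cost:.4f}")
```

Input and output tokens are priced separately, so a request's cost depends on both counts and not only on the reported total.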
29 changes: 28 additions & 1 deletion libs/community/langchain_community/callbacks/manager.py
@@ -10,6 +10,9 @@

from langchain_core.tracers.context import register_configure_hook

from langchain_community.callbacks.bedrock_anthropic_callback import (
BedrockAnthropicTokenUsageCallbackHandler,
)
from langchain_community.callbacks.openai_info import OpenAICallbackHandler
from langchain_community.callbacks.tracers.comet import CometTracer
from langchain_community.callbacks.tracers.wandb import WandbTracer
@@ -19,14 +22,18 @@
openai_callback_var: ContextVar[Optional[OpenAICallbackHandler]] = ContextVar(
"openai_callback", default=None
)
wandb_tracing_callback_var: ContextVar[Optional[WandbTracer]] = ContextVar( # noqa: E501
bedrock_anthropic_callback_var: (ContextVar)[
Optional[BedrockAnthropicTokenUsageCallbackHandler]
] = ContextVar("bedrock_anthropic_callback", default=None)
wandb_tracing_callback_var: ContextVar[Optional[WandbTracer]] = ContextVar(
"tracing_wandb_callback", default=None
)
comet_tracing_callback_var: ContextVar[Optional[CometTracer]] = ContextVar( # noqa: E501
"tracing_comet_callback", default=None
)

register_configure_hook(openai_callback_var, True)
register_configure_hook(bedrock_anthropic_callback_var, True)
register_configure_hook(
wandb_tracing_callback_var, True, WandbTracer, "LANGCHAIN_WANDB_TRACING"
)
@@ -53,6 +60,26 @@ def get_openai_callback() -> Generator[OpenAICallbackHandler, None, None]:
openai_callback_var.set(None)


@contextmanager
def get_bedrock_anthropic_callback() -> \
        Generator[BedrockAnthropicTokenUsageCallbackHandler, None, None]:
    """Get the Bedrock Anthropic callback handler in a context manager,
    which conveniently exposes token and cost information.
    Returns:
        BedrockAnthropicTokenUsageCallbackHandler:
            The Bedrock Anthropic callback handler.
    Example:
        >>> with get_bedrock_anthropic_callback() as cb:
        ...     # Use the Bedrock Anthropic callback handler
    """
    cb = BedrockAnthropicTokenUsageCallbackHandler()
    bedrock_anthropic_callback_var.set(cb)
    yield cb
    bedrock_anthropic_callback_var.set(None)
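
The pattern used here, a context manager that publishes a handler through a `ContextVar` and clears it on exit, can be sketched without LangChain as follows. `UsageCounter`, `usage_var`, `track_usage`, and `record` are hypothetical stand-ins. Unlike the function in the diff, this sketch restores the variable in a `finally` block so that it is also cleared when the body raises; that is a design choice of the sketch, not something the commit does.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Generator, Optional


class UsageCounter:
    """Hypothetical stand-in for a token usage callback handler."""

    def __init__(self) -> None:
        self.total_tokens = 0


# Holds the active counter for the current context, or None outside the block.
usage_var: ContextVar[Optional[UsageCounter]] = ContextVar("usage", default=None)


@contextmanager
def track_usage() -> Generator[UsageCounter, None, None]:
    counter = UsageCounter()
    token = usage_var.set(counter)
    try:
        yield counter
    finally:
        # Restore the previous value even if the body raised.
        usage_var.reset(token)


def record(tokens: int) -> None:
    """Code deep in the call stack finds the active counter via the ContextVar."""
    active = usage_var.get()
    if active is not None:
        active.total_tokens += tokens


with track_usage() as counter:
    record(3)
    record(4)

print(counter.total_tokens)  # 7
print(usage_var.get())  # None
```

The counter stays readable after the `with` block ends, which is what lets callers inspect totals once all requests have finished.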


@contextmanager
def wandb_tracing_enabled(
session_name: str = "default",
2 changes: 1 addition & 1 deletion libs/community/langchain_community/chat_models/bedrock.py
@@ -308,7 +308,7 @@ def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
final_output = {}
for output in llm_outputs:
output = output or {}
usage = output.pop("usage", {})
usage = output.get("usage", {})
for token_type, token_count in usage.items():
final_usage[token_type] += token_count
final_output.update(output)
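
This hunk replaces `output.pop("usage", {})` with `output.get("usage", {})`. One observable difference is that `pop` mutates the caller's dicts by removing the `usage` key, while `get` only reads it. That plausibly matters for a callback that later reads `llm_output["usage"]`, though the commit does not state its motivation. The standalone sketch below uses made-up input data and a hypothetical `combine` function modeled loosely on the loop above.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def combine(llm_outputs: List[Optional[dict]], mutate: bool) -> Dict[str, int]:
    """Sum token counts across outputs (hypothetical, simplified helper).

    With mutate=True the "usage" key is popped from each input dict;
    with mutate=False it is only read.
    """
    final_usage: Dict[str, int] = defaultdict(int)
    for output in llm_outputs:
        output = output or {}
        usage = output.pop("usage", {}) if mutate else output.get("usage", {})
        for token_type, token_count in usage.items():
            final_usage[token_type] += token_count
    return dict(final_usage)


popped = [{"usage": {"prompt_tokens": 2, "completion_tokens": 1}}]
kept = [{"usage": {"prompt_tokens": 2, "completion_tokens": 1}}]

totals_pop = combine(popped, mutate=True)
totals_get = combine(kept, mutate=False)

print(totals_pop == totals_get)  # True: the sums are identical
print("usage" in popped[0])      # False: pop removed the key from the input
print("usage" in kept[0])        # True: get left the input untouched
```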
8 changes: 2 additions & 6 deletions libs/community/langchain_community/vectorstores/chroma.py
@@ -28,7 +28,6 @@
import chromadb.config
from chromadb.api.types import ID, OneOrMany, Where, WhereDocument


logger = logging.getLogger()
DEFAULT_K = 4 # Number of Documents to return.

@@ -81,7 +80,6 @@ def __init__(
try:
import chromadb
import chromadb.config
from chromadb.utils import embedding_functions
except ImportError:
raise ImportError(
"Could not import chromadb python package. "
@@ -124,12 +122,10 @@ def __init__(
_client_settings.persist_directory or persist_directory
)

self._embedding_function = (
embedding_function or embedding_functions.DefaultEmbeddingFunction()
)
self._embedding_function = embedding_function
self._collection = self._client.get_or_create_collection(
name=collection_name,
embedding_function=self._embedding_function,
embedding_function=None,
metadata=collection_metadata,
)
self.override_relevance_score_fn = relevance_score_fn
32 changes: 32 additions & 0 deletions libs/community/tests/unit_tests/callbacks/test_callback_manager.py
@@ -7,6 +7,7 @@
from langchain_core.tracers.langchain import LangChainTracer, wait_for_all_tracers

from langchain_community.callbacks import get_openai_callback
from langchain_community.callbacks.manager import get_bedrock_anthropic_callback
from langchain_community.llms.openai import BaseOpenAI


@@ -77,6 +78,37 @@ def test_callback_manager_configure_context_vars(
)
mngr.on_llm_start({}, ["prompt"])[0].on_llm_end(response)

# The callback handler has been updated
assert cb.successful_requests == 1
assert cb.total_tokens == 3
assert cb.prompt_tokens == 2
assert cb.completion_tokens == 1
assert cb.total_cost > 0

with get_bedrock_anthropic_callback() as cb:
# This is a new empty callback handler
assert cb.successful_requests == 0
assert cb.total_tokens == 0

# configure adds this bedrock anthropic cb,
# but doesn't modify the group manager
mngr = CallbackManager.configure(group_manager)
assert mngr.handlers == [tracer, cb]
assert group_manager.handlers == [tracer]

response = LLMResult(
generations=[],
llm_output={
"usage": {
"prompt_tokens": 2,
"completion_tokens": 1,
"total_tokens": 3,
},
"model_id": "anthropic.claude-instant-v1"
},
)
mngr.on_llm_start({}, ["prompt"])[0].on_llm_end(response)

# The callback handler has been updated
assert cb.successful_requests == 1
assert cb.total_tokens == 3