LangSmith Client SDK Implementations (Python, updated Jun 12, 2024)
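Since the listing opens with the LangSmith client SDK for Python, a minimal usage sketch follows as a point of reference. It is a hedged illustration only: it assumes an API key is already configured in the environment (e.g. LANGCHAIN_API_KEY) and shows just the basic Client and traceable entry points.

```python
# Minimal sketch of the LangSmith Python client (illustrative; assumes
# LANGCHAIN_API_KEY is set so Client() can authenticate).
from langsmith import Client, traceable

client = Client()  # reads endpoint and API key from environment variables

@traceable(run_type="chain")  # logs inputs/outputs of each call as a run in LangSmith
def answer(question: str) -> str:
    # Placeholder "model" call; in real use this would invoke an LLM.
    return f"echo: {question}"

if __name__ == "__main__":
    print(answer("What does the evaluation topic cover?"))
    # Datasets for evaluation runs can be created through the same client, e.g.:
    # dataset = client.create_dataset("example-eval-dataset")
    # client.create_example(
    #     inputs={"question": "..."}, outputs={"answer": "..."}, dataset_id=dataset.id
    # )
```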
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. YC W23
Documentation for langsmith
A fairly robust mathematics parsing engine for C++ projects.
A task generation and model evaluation system.
The official evaluation suite and dynamic data release for MixEval.
Python client for Kolena's machine learning testing platform
The RAG Experiment Accelerator is a versatile tool for running and evaluating experiments with Azure Cognitive Search and the RAG pattern.
Test your prompts, agents, and RAGs. Use LLM evals to improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
CyclOps for clinical ML evaluation & monitoring workshop
[ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism
Python SDK for running evaluations on LLM generated responses
Valor is a centralized evaluation store which makes it easy to measure, explore, and rank model performance.
Evaluation tools for time series machine learning algorithms.
[ACL 2024] CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
Trajectopy - Trajectory Evaluation in Python
Build AI applications with confidence. DSPy Visualizer. Understand how your users are using your LLM-app, get a full picture of its quality and performance, collaborate with your stakeholders in ONE platform, and iterate towards the most valuable & reliable LLM-app.
DevQualityEval: An evaluation benchmark and framework to compare and evolve the quality of code generation of LLMs.