Confident

DeepEval package for unit testing LLMs.

This guide demonstrates how to test and measure the performance of LLMs. It shows how to use our callback to measure performance, and how to define your own metrics and log them to our dashboard.

DeepEval also offers:

Synthetic data generation
Performance measurement
A dashboard to monitor and review results over time

Installation and Setup

!pip install deepeval langchain langchain-openai
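After installing, it can help to confirm that the three packages are actually visible to the running interpreter. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of DeepEval or LangChain.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # The same distributions as in the pip command above.
    for name, ver in installed_versions(["deepeval", "langchain", "langchain-openai"]).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If any entry shows as not installed, the notebook kernel is probably using a different environment from the one pip installed into.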

Getting API Credentials

To get the DeepEval API credentials, follow these steps:

Go to https://app.confident-ai.com
Click on "Organization"
Copy the API Key.

When you log in, you will also be asked to set the implementation name, which describes the type of implementation. Think of it as the name of your project; we recommend making it descriptive.

import os

import deepeval

# Read the API key from the environment rather than hard-coding it.
api_key = os.getenv("DEEPEVAL_API_KEY")
deepeval.login(api_key)

🎉🥳 Congratulations! You've successfully logged in! 🙌
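The login snippet reads the key from the `DEEPEVAL_API_KEY` environment variable, and `os.getenv` returns `None` without complaint when that variable is unset. One way to surface the problem early is a small guard such as the sketch below; the helper name `require_api_key` is hypothetical and not part of DeepEval.

```python
import os


def require_api_key(var_name: str = "DEEPEVAL_API_KEY") -> str:
    """Return the API key stored in `var_name`, or raise if it is missing or blank."""
    value = os.getenv(var_name, "").strip()
    if not value:
        raise RuntimeError(
            f"{var_name} is not set; copy the API key from the Confident AI "
            "dashboard and export it before logging in."
        )
    return value


# Intended use, assuming deepeval is installed:
#   import deepeval
#   deepeval.login(require_api_key())
```

Failing here gives a clearer message than an authentication error raised later inside the login call.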
 

Setup Confident AI Callback (Modern)

The previous DeepEvalCallbackHandler and metric tracking are deprecated. Please use the new integration below.

from deepeval.integrations.langchain import CallbackHandler

handler = CallbackHandler(
    name="My Trace",
    tags=["production", "v1"],
    metadata={"experiment": "A/B"},
    thread_id="thread-123",
    user_id="user-456",
)
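Creating the handler does not by itself attach it to anything. In LangChain, callbacks are usually passed per call through the `config` argument of `invoke`. The sketch below wraps that pattern in a hypothetical helper, `invoke_with_tracing`, which works with any object exposing a LangChain-style `invoke(input, config=...)` method. The model name in the usage comment is an assumption.

```python
from typing import Any


def invoke_with_tracing(runnable: Any, handler: Any, prompt: str) -> Any:
    """Invoke a LangChain-style runnable with a callback handler attached to this call."""
    # Handlers listed under "callbacks" in the run config apply to this
    # invocation and to the child runs it triggers.
    return runnable.invoke(prompt, config={"callbacks": [handler]})


# Usage sketch (not executed here; needs OPENAI_API_KEY and a logged-in
# DeepEval session, with `handler` being the CallbackHandler created above):
#   from langchain_openai import ChatOpenAI
#   llm = ChatOpenAI(model="gpt-4o-mini")  # model name is an assumption
#   reply = invoke_with_tracing(llm, handler, "What is unit testing?")
#   print(reply.content)
```

Passing the handler per call keeps tracing explicit, and allows a different `thread_id` or `user_id` for each request by constructing a new handler.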
