Define and run the eval

To log the evaluation results to the Evidently Platform, first connect to Evidently Cloud or your local workspace and create a Project. This step is optional: you can also run evals locally in Python.
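
A minimal sketch of connecting to Evidently Cloud and creating a Project (the token is a placeholder, and the exact import path can differ between Evidently versions; a self-hosted or local Workspace works similarly):

from evidently.ui.workspace import CloudWorkspace

ws = CloudWorkspace(
    token="YOUR_API_TOKEN",  # placeholder: your Evidently Cloud API key
    url="https://app.evidently.cloud",
)
project = ws.create_project("My eval project")  # Cloud may also require an org_id

The ws and project objects are reused in the last step to upload the results.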

1. Prepare the input data

Get your data in a table, such as a pandas.DataFrame. More on data requirements. You can also load data from the Evidently Platform, like tracing data you captured or synthetic datasets.
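
For instance, a toy input table with one question-answer pair per row (the "Question" and "Answer" column names are only examples, reused in the snippets below):

import pandas as pd

source_df = pd.DataFrame({
    "Question": ["What is Evidently?", "How do I run an eval?"],
    "Answer": [
        "An open-source library for evaluating ML and LLM systems.",
        "Create a Dataset, add descriptors, then run a Report.",
    ],
})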

2. Create a Dataset object

Create a Dataset object with DataDefinition() that specifies column roles and types. You can also use default type detection. How to set Data Definition.
from evidently import Dataset, DataDefinition

# Wrap the DataFrame; column types are auto-detected by default
eval_data = Dataset.from_pandas(
    source_df,
    data_definition=DataDefinition()
)
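
If you prefer to set column types explicitly instead of relying on detection, you can pass them to DataDefinition. A sketch assuming both example columns contain raw text:

eval_data = Dataset.from_pandas(
    source_df,
    data_definition=DataDefinition(text_columns=["Question", "Answer"])
)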

3. (Optional) Add descriptors

For LLM and text evals, define row-level descriptors to compute. Here, you can use a variety of methods, from deterministic checks to LLM judges. Optionally, add row-level tests to get explicit pass/fail outcomes on set conditions, as shown after the snippet below. How to use Descriptors.
from evidently.descriptors import TextLength, Sentiment

# Row-level scores computed for each row of the dataset
eval_data.add_descriptors(descriptors=[
    TextLength("Question", alias="Length"),
    Sentiment("Answer", alias="Sentiment")
])
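
For example, to attach a row-level pass/fail condition to a descriptor, you can pass test conditions directly. A sketch assuming descriptors accept a tests argument (check the Descriptors docs for the exact syntax):

from evidently.tests import lte

# Each row passes if the answer is at most 100 characters long
eval_data.add_descriptors(descriptors=[
    TextLength("Answer", alias="AnswerLength", tests=[lte(100)]),
])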

4. Configure Report

For dataset-level evals (classification, data drift) or to summarize descriptors, create a Report with chosen metrics or presets. How to configure Reports.
from evidently import Report
from evidently.presets import DataSummaryPreset

report = Report([
    DataSummaryPreset()
])

5. (Optional) Add Test conditions

Add dataset-level Pass/Fail conditions, for example, to check that all texts in the dataset are under 100 characters long. How to configure Tests.
from evidently.metrics import MaxValue
from evidently.tests import lt

report = Report([
    DataSummaryPreset(),
    # Fails unless the maximum "Length" descriptor value is under 100
    MaxValue(column="Length", tests=[lt(100)]),
])

6. (Optional) Add Tags and Timestamps

Add tags or metadata to identify specific evaluation runs or datasets, or override the default timestamp. How to add metadata.
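
A sketch, assuming tags and a custom timestamp can be passed when running the Report (the parameter names here are assumptions; verify them on the linked page):

from datetime import datetime

my_eval = report.run(
    eval_data,
    None,
    timestamp=datetime(2025, 1, 1),  # assumed: overrides the default run timestamp
    tags=["quickstart"],             # assumed: labels this evaluation run
)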

7. Run the Report

To execute the eval, run the Report on the Dataset (or on two Datasets to compare them).
# The second argument is an optional reference dataset; pass None to evaluate a single dataset
my_eval = report.run(eval_data, None)

8. Explore the results

Upload the results to your Project on the Evidently Platform, or view them directly in Python.

# Upload the evaluation results to your Project
ws.add_run(project.id, my_eval, include_data=True)

# Or render the Report in a notebook, or get the results as JSON
my_eval
# my_eval.json()

Quickstarts

See these end-to-end examples:

LLM quickstart

Evaluate the quality of text outputs.

ML quickstart

Test tabular data quality and data drift.