
Define and run the eval

To log the evaluation results to the Evidently Platform, first connect to Evidently Cloud or your local workspace and create a Project. This step is optional: you can also run evals locally in Python.
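For example, a minimal connection sketch. The API token and project name are placeholders, and depending on your account setup you may also need to pass an organization ID when creating a Cloud project:
from evidently.ui.workspace import CloudWorkspace

# Connect to Evidently Cloud (use a local or self-hosted workspace instead if you prefer)
ws = CloudWorkspace(
    token="YOUR_API_TOKEN",            # placeholder API token
    url="https://app.evidently.cloud",
)

# Create a Project to log evaluation results to
project = ws.create_project("My eval project")  # on Evidently Cloud you may also need org_id=...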
1. Prepare the input data

Get your data as a table, such as a pandas.DataFrame. More on data requirements. You can also load data from the Evidently Platform, such as tracing data you captured or synthetic datasets.
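For example, a small illustrative dataset with one question-answer pair per row (the column names and values are made up):
import pandas as pd

# Illustrative input data: one row per question-answer pair to evaluate
source_df = pd.DataFrame({
    "Question": ["What is Evidently?", "How do I run an eval?"],
    "Answer": [
        "An open-source framework for evaluating and monitoring ML and LLM systems.",
        "Create a Dataset, add descriptors, and run a Report on it.",
    ],
})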
2. Create a Dataset object

Create a Dataset object with DataDefinition() that specifies column roles and types. You can also rely on default type detection. How to set Data Definition.
from evidently import Dataset, DataDefinition

eval_data = Dataset.from_pandas(
    source_df,
    data_definition=DataDefinition()
)
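If you want to set column types explicitly instead of relying on auto-detection, a sketch might look like this; the text_columns argument and the column names are assumptions, so check the Data Definition docs for the full set of options:
eval_data = Dataset.from_pandas(
    source_df,
    data_definition=DataDefinition(
        text_columns=["Question", "Answer"],  # treat these columns as raw text
    )
)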
3. (Optional) Add descriptors

For LLM and text evals, define row-level descriptors to compute. You can use a variety of methods, from deterministic checks to LLM judges. Optionally, add row-level tests to get explicit pass/fail outcomes on set conditions (see the sketch after the code example below). How to use Descriptors.
from evidently.descriptors import TextLength, Sentiment

eval_data.add_descriptors(descriptors=[
    TextLength("Question", alias="Length"),
    Sentiment("Answer", alias="Sentiment")
])
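A sketch of a row-level test attached to a descriptor. The tests= argument and the lte condition are assumptions based on the dataset-level test syntax shown in step 5, so check the Descriptors docs for the exact form:
from evidently.descriptors import TextLength
from evidently.tests import lte

# Flag rows where the answer is longer than 100 characters (assumed syntax)
eval_data.add_descriptors(descriptors=[
    TextLength("Answer", alias="AnswerLength", tests=[lte(100)]),
])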
4. Configure the Report

For dataset-level evals (classification, data drift) or to summarize descriptors, create a Report with chosen metrics or presets. How to configure Reports.
from evidently import Report
from evidently.presets import DataSummaryPreset

report = Report([
    DataSummaryPreset()
])
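To summarize the descriptors computed in step 3, a sketch using the TextEvals preset instead of (or alongside) DataSummaryPreset:
from evidently.presets import TextEvals

# Summarizes all descriptors added to the Dataset
report = Report([
    TextEvals()
])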
5. (Optional) Add Test conditions

Add dataset-level Pass/Fail conditions, for example, to check that all texts in the dataset are under 100 characters long. How to configure Tests.
from evidently.metrics import MaxValue
from evidently.tests import lt

report = Report([
    DataSummaryPreset(),
    MaxValue(column="Length", tests=[lt(100)]),
])
6. (Optional) Add Tags and Timestamps

Add tags or metadata to identify specific evaluation runs or datasets, or override the default timestamp, as sketched below. How to add metadata.
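A hedged sketch of how these options might be passed when running the Report in the next step. The tags, metadata, and timestamp keyword arguments and their values are assumptions, so check the metadata docs for the exact signature:
from datetime import datetime

# Assumed keyword arguments; the values are illustrative placeholders
my_eval = report.run(
    eval_data,
    None,
    tags=["prod", "prompt-v2"],
    metadata={"model": "gpt-4o"},
    timestamp=datetime(2024, 1, 1),
)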
7. Run the Report

To execute the eval, run the Report on the Dataset (or on two Datasets if you want to compare them).
# The second argument is an optional reference Dataset; pass None to evaluate a single dataset
my_eval = report.run(eval_data, None)
8. Explore the results

# Upload the results (and, optionally, the dataset itself) to your Project on the Evidently Platform
ws.add_run(project.id, my_eval, include_data=True)

# Or explore locally: render the results in a notebook cell, or get them as JSON
my_eval
# my_eval.json()

Quickstarts

Check the quickstarts for end-to-end examples: