Report
How to generate Reports.
Reports perform evaluations on the dataset level and/or summarize the results of row-level evaluations. For a general introduction, check Core Concepts.
Pre-requisites:
- You installed Evidently.
- You created a Dataset with the Data Definition.
- (Optional) For text data, you added Descriptors.
Imports
Import the Metrics and Presets you plan to use.
You can use Metric Presets, which are pre-built Reports that work out of the box, or create a custom Report selecting Metrics one by one.
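For example, in recent Evidently versions the imports look roughly like this (adjust the list to the Presets and Metrics you actually use):

```python
from evidently import Report
from evidently.presets import DataSummaryPreset, DataDriftPreset
from evidently.metrics import RowCount, MissingValueCount, ValueDrift
```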
Presets
Available Presets. Check available evals in the Reference table.
To generate a template Report, simply pass the selected Preset to the Report and run it over your data. If nothing else is specified, the Report will run with the default parameters for all columns in the dataset.
Single dataset. To generate the Data Summary Report for a single dataset:
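A minimal sketch, assuming eval_data_1 is the Evidently Dataset you created earlier:

```python
report = Report([DataSummaryPreset()])
my_eval = report.run(eval_data_1, None)
```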
After you run the Report, the resulting my_eval contains the computed values for each Metric, along with associated metadata and visualizations. (We sometimes refer to this computation result as a snapshot.)
You can render the results in Python, export them as HTML, JSON or a Python dictionary, or upload them to the Evidently platform. Check more in output formats.
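For instance, assuming the snapshot export methods available in recent Evidently versions:

```python
my_eval                             # render the visual Report in a notebook cell
my_eval.save_html("report.html")    # save an HTML file
my_eval.json()                      # get results as a JSON string
my_eval.dict()                      # get results as a Python dictionary
```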
Two datasets. To generate Reports like Data Drift that need two datasets, pass the second one as a reference when you run it:
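A sketch under the same assumptions, with eval_data_2 as a second Evidently Dataset:

```python
report = Report([DataDriftPreset()])
my_eval = report.run(eval_data_1, eval_data_2)
```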
In this case, the first dataset (eval_data_1) is the current data you evaluate, and the second (eval_data_2) is the reference dataset you use as a baseline for drift detection. You can also pass them explicitly:
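For example, assuming run() accepts current_data and reference_data keyword arguments:

```python
my_eval = report.run(current_data=eval_data_1, reference_data=eval_data_2)
```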
Combine Presets. You can also include multiple Presets in the same Report. List them one by one.
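For example:

```python
report = Report([
    DataSummaryPreset(),
    DataDriftPreset(),
])
my_eval = report.run(eval_data_1, eval_data_2)
```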
Limit columns. You can limit the columns to which the Preset is applied.
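A sketch, assuming the Preset accepts a columns argument and using illustrative column names:

```python
report = Report([
    DataDriftPreset(columns=["age", "salary"]),
])
```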
Custom Report
Available Metrics and parameters. Check available evals in the Reference table.
Choose Metrics. To create a custom Report, simply list the Metrics one by one. You can combine both dataset-level and column-level Metrics, and mix Presets and Metrics in one Report. When you use a column-level Metric, you must specify the column it refers to.
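A sketch with illustrative Metric and column names:

```python
report = Report([
    DataDriftPreset(),                 # a Preset
    RowCount(),                        # a dataset-level Metric
    MissingValueCount(column="age"),   # a column-level Metric
])
my_eval = report.run(eval_data_1, eval_data_2)
```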
Generating multiple column-level Metrics: You can use a helper function to easily generate multiple column-level Metrics for a list of columns. See the page on Metric Generator.
Metric Parameters. Metrics can have optional or required parameters.
For example, the data drift detection algorithm automatically selects a method, but you can override this by specifying your preferred method (Optional).
To calculate Precision at K for a ranking task, you must always pass the k parameter (Required).
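For example, assuming the ValueDrift and PrecisionTopK Metrics and illustrative column and method names:

```python
# Optional parameter: override the automatically selected drift method
ValueDrift(column="salary", method="psi")

# Required parameter: Precision at K always needs k
PrecisionTopK(k=5)
```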
Compare results
If you computed multiple snapshots, you can quickly compare the resulting metrics side-by-side in a dataframe:
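One possible sketch uses the dictionary export of each snapshot; the exact keys may differ between Evidently versions, so inspect the output of dict() first:

```python
import pandas as pd

# Assumed structure: my_eval.dict() returns a "metrics" list whose items
# carry "metric_id" and "value" keys.
def to_row(snapshot):
    return {m["metric_id"]: m["value"] for m in snapshot.dict()["metrics"]}

comparison = pd.DataFrame(
    [to_row(my_eval_1), to_row(my_eval_2)],
    index=["run_1", "run_2"],
)
```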
Group by
You can calculate Metrics separately for different groups in your data, using a column with categories to split by. Use the GroupBy metric as shown below.
Example. This will compute the maximum salary value for each label in the “Department” column.
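A sketch, assuming GroupBy wraps a column-level Metric together with the grouping column name, and using an illustrative salary column:

```python
from evidently.metrics import GroupBy, MaxValue  # assumed import path

report = Report([
    GroupBy(MaxValue(column="salary"), "Department"),
])
my_eval = report.run(eval_data_1, None)
```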
Note: you cannot use auto-generated Test conditions when you use GroupBy.
What’s next?
You can also add conditions to Metrics: check the Tests guide.