This quickstart shows how to run evals on tabular data, like:
- evaluating prediction quality (e.g. classification or regression accuracy)
- checking input data quality (e.g. missing values, out-of-range features)
- detecting data and prediction drift
1. Set up your environment
For a fully local flow, skip steps 1.1 and 1.3.
1.1. Set up Evidently Cloud
- Sign up for a free Evidently Cloud account.
- If you are logging in for the first time, create an Organization and note its ID.
- Get an API token: click the Key icon in the left menu, then generate and save the token.
1.2. Installation and imports
Install the Evidently Python library:
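```bash
pip install evidently
```

Then add the imports used in this tutorial. This is a minimal set, assuming a recent Evidently release (module paths have changed between versions):

```python
import pandas as pd
from sklearn import datasets

from evidently import Report
from evidently.presets import DataDriftPreset, DataSummaryPreset
from evidently.ui.workspace import CloudWorkspace
```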
1.3. Create a Project
Connect to Evidently Cloud using your API token:
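A minimal sketch, assuming the CloudWorkspace API from recent Evidently versions; "YOUR_API_TOKEN" and "YOUR_ORG_ID" are placeholders for your own values:

```python
# Connect to Evidently Cloud with the API token you generated earlier
ws = CloudWorkspace(token="YOUR_API_TOKEN", url="https://app.evidently.cloud")

# Create a Project inside your Organization
project = ws.create_project("My quickstart project", org_id="YOUR_ORG_ID")
project.description = "Tabular data drift quickstart"
project.save()
```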
2. Prepare a toy dataset
Let’s import a toy dataset with tabular data:
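For example, you can pull the OpenML "adult" census dataset, which Evidently tutorials commonly use, as a pandas DataFrame:

```python
# Fetch the "adult" dataset from OpenML as a pandas DataFrame
adult_data = datasets.fetch_openml(name="adult", version=2, as_frame="auto")
adult = adult_data.frame
```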
Have trouble downloading the data?
If OpenML is not available, you can download the same dataset from here:
Prod data will include people with education levels unseen in the reference dataset:
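A sketch of the split; the variable name eval_data_1 is an assumption, mirroring eval_data_2 below:

```python
# "Prod" data: education levels that will be absent from the reference dataset
eval_data_1 = adult[adult.education.isin(["Some-college", "HS-grad", "Bachelors"])]
```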
Eval_data_2 will be our reference dataset that we’ll compare against:
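```python
# Reference data: everyone else, with no overlap in education levels
eval_data_2 = adult[~adult.education.isin(["Some-college", "HS-grad", "Bachelors"])]
```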
3. Get a Report
Let’s generate a Data Drift preset that will check for statistical distribution changes across all columns in the dataset. You can customize drift parameters by choosing different methods and thresholds; in our case, we proceed as is, so the default tests selected by Evidently will apply.
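A sketch using the current Report API (in older Evidently versions the call was Report(metrics=[...]) with report.run(reference_data=..., current_data=...)):

```python
# Compare the "prod" data against the reference dataset
report = Report([DataDriftPreset()])
my_eval = report.run(eval_data_1, eval_data_2)
```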
4. Explore the results
Local preview. In a Python environment like a Jupyter notebook or Colab, run:
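```python
my_eval  # renders the interactive Report directly in the notebook cell
```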
5. Get a Dashboard (Optional)
As you run repeated evals, you may want to track the results over time by creating a Dashboard. Evidently lets you configure the dashboard in the UI or using dashboards-as-code.
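A hedged sketch of the as-code route: first upload the eval to your Project, then add a panel. The add_run signature, panel classes, and import paths below follow recent Evidently docs and may differ in your version; treat them as assumptions:

```python
# Upload the eval so it shows up in the Project (assumed signature per recent docs)
ws.add_run(project.id, my_eval, include_data=False)

# Add a line plot tracking the number of drifted columns over time
# NOTE: panel classes and import paths are assumptions based on recent releases
from evidently.sdk.models import PanelMetric
from evidently.sdk.panels import DashboardPanelPlot

project.dashboard.add_panel(
    DashboardPanelPlot(
        title="Drifted columns",
        values=[PanelMetric(legend="Drifted columns", metric="DriftedColumnsCount")],
        plot_params={"plot_type": "line"},
    )
)
```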
What’s next?
- See available Evidently Metrics: All Metric Table.
- Understand how you can add conditional tests to your Reports: Tests.
- Explore options for Dashboard design: Dashboards.
Alternatively, try DataSummaryPreset, which generates a summary of all columns in the dataset and runs auto-generated Tests to check data quality and core descriptive stats:
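A sketch, reusing the datasets and Report API from above:

```python
# Summarize all columns and run auto-generated data quality Tests
report = Report([DataSummaryPreset()])
my_eval = report.run(eval_data_1, eval_data_2)
```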