- You know how to generate Reports and select Metrics.
Imports
To use Tests, import the following modules:
Auto-generated conditions

There are 3 ways to run conditional checks:

- Test Presets. Get a suite of pre-selected Tests with auto-generated conditions.
- Tests with defaults. Pick Tests one by one, with auto-generated conditions.
- Custom Tests. Choose all Tests and set conditions manually.
Test Presets
Test Presets automatically generate a set of Tests to evaluate your data or AI system. Each Report Preset has this option. Enable it by setting `include_tests=True` at the Report level (default: `False`).
For example, the `DataSummaryPreset()` Report simply shows descriptive stats of your data; adding the Tests will additionally run multiple checks on data quality and expected column statistics.
The automatic Test conditions can either
- be derived from a reference dataset, or
- use built-in heuristics.
Note that in this case the order matters: the first `eval_data_1` is the current data you evaluate, and the second `eval_data_2` is the reference dataset you consider as a baseline and use to generate Test conditions.

How to check Test defaults? Consult the All Metrics reference table.
Individual Tests with defaults
Presets are great for a start or quick sanity checks, but often you'd want to select specific Tests. For example, instead of running checks on all value statistics, validate only the mean or max. You can pick the Tests while still using default conditions.

Select Tests. List the individual Metrics and enable the `include_tests` option. To keep the auto-generated conditions, set `tests` to `None` or leave it empty:
This runs `MinValue()` with auto-generated conditions.
Custom Test conditions
You can define specific pass/fail conditions for each Test. For example, set a minimum expected precision, or the expected share of a certain category. Tests fail when conditions aren't met.

Setting conditions. For each Metric you want to validate, define a list of `tests` and set the expected behavior using parameters like `gt` (greater than), `lt` (less than), or `eq` (equal).
For example, to verify that there are no missing values and no values below 18 in the “Age” column:
Note that you do not need to set `include_tests` when setting Tests manually.
Sometimes you may need to use other parameters to set Test conditions. The `tests` parameter applies when a Metric returns a single value, or tests the count for Metrics that return both a count and a share. For Metrics with multiple outputs (e.g. MAE returns mean and std), you may need to use dedicated parameters like `mean_tests` and `std_tests`. You can check Metric outputs on the All Metrics page.

Test parameters
Here are the conditions you can set:

| Condition | Explanation | Example |
|---|---|---|
| `eq(val)` | equal: `test_result == val` | `MinValue(column="Age", tests=[eq(18)])` |
| `not_eq(val)` | not equal: `test_result != val` | `MinValue(column="Age", tests=[not_eq(18)])` |
| `gt(val)` | greater than: `test_result > val` | `MinValue(column="Age", tests=[gt(18)])` |
| `gte(val)` | greater than or equal: `test_result >= val` | `MinValue(column="Age", tests=[gte(18)])` |
| `lt(val)` | less than: `test_result < val` | `MinValue(column="Age", tests=[lt(18)])` |
| `lte(val)` | less than or equal: `test_result <= val` | `MinValue(column="Age", tests=[lte(18)])` |
| `is_in(list)` | `test_result` is one of the values | `MinValue(column="Age", tests=[is_in([18, 21, 30])])` |
| `not_in(list)` | `test_result` is not any of the values | `MinValue(column="Age", tests=[not_in([16, 17, 18])])` |
How to check available parameters? Consult the All Metrics reference table.
You can combine both approaches: keep `include_tests=True` and add custom conditions where needed.
Metrics like `MissingValueCount` or `CategoryCount` return both an absolute count and a percentage. The default `tests` parameter sets conditions against the absolute value. To test the relative value, use the `share_tests` parameter.
To test for fewer than 5 missing values (absolute):
Tests relative to reference
Testing against reference. If you pass a reference dataset, you can set conditions relative to the reference values. For example, to Test that the number of rows in the current dataset is equal to or greater than the reference number of rows +/- 10%:
Set Test criticality

By default, failed Tests return Fail. To get a Warning instead, set `is_critical=False`: