All tests

List of all tests and test presets available in Evidently.
How to use this page
We do our best to keep this page up to date. In case of discrepancies, consult the API reference or the current version of the "All tests" example notebook in the Examples section. If you notice an error, please send us a pull request to update the documentation!

Test Presets

Default conditions for each test in the preset match the test's defaults. You can see them in the following sections on this page.
Preset name and Description
Parameters
NoTargetPerformanceTestPreset
  • TestShareOfDriftedColumns()
  • TestColumnDrift(column_name=prediction)
  • TestColumnDrift(column_name=column_name) for columns if provided
  • TestColumnsType()
  • TestColumnShareOfMissingValues(column_name=column_name) for all or columns if provided
  • TestShareOfOutRangeValues(column_name=column_name) for all numerical_columns or among columns if provided
  • TestShareOfOutListValues(column_name=column_name) for all categorical_columns or among columns if provided
  • TestMeanInNSigmas(column_name=column_name, n_sigmas=2) for all numerical_columns or among columns if provided
Optional:
  • columns
  • stattest
  • cat_stattest
  • num_stattest
  • per_column_stattest
  • stattest_threshold
  • cat_stattest_threshold
  • num_stattest_threshold
  • per_column_stattest_threshold
  • drift_share
DataStabilityTestPreset
  • TestNumberOfRows()
  • TestNumberOfColumns()
  • TestColumnsType()
  • TestColumnShareOfMissingValues()
  • TestShareOfOutRangeValues(column_name=column_name) for all numerical_columns or among columns if provided
  • TestShareOfOutListValues(column_name=column_name) for all categorical_columns or among columns if provided
  • TestMeanInNSigmas(column_name=column_name, n_sigmas=2) for all numerical_columns or among columns if provided
Optional:
  • columns
DataQualityTestPreset
  • TestColumnShareOfMissingValues(column_name=column_name) for all or columns
  • TestMostCommonValueShare(column_name=column_name) for all or columns
  • TestNumberOfConstantColumns()
  • TestNumberOfDuplicatedColumns()
  • TestNumberOfDuplicatedRows()
  • TestHighlyCorrelatedColumns()
Optional:
  • columns
DataDriftTestPreset
  • TestShareOfDriftedColumns()
  • TestColumnDrift(column_name=column_name) for all or columns if provided
Optional:
  • columns
  • stattest
  • cat_stattest
  • num_stattest
  • per_column_stattest
  • stattest_threshold
  • cat_stattest_threshold
  • num_stattest_threshold
  • per_column_stattest_threshold
RegressionTestPreset
  • TestValueMeanError()
  • TestValueMAE()
  • TestValueRMSE()
  • TestValueMAPE()
N/A
MulticlassClassificationTestPreset
  • TestAccuracyScore()
  • TestF1Score()
  • TestPrecisionByClass()
  • TestRecallByClass()
  • TestColumnDrift(column_name=target)
  • TestNumberOfRows()
If probabilistic classification, also:
  • TestLogLoss()
  • TestRocAuc()
Optional:
  • stattest
  • stattest_threshold
BinaryClassificationTopKTestPreset
  • TestAccuracyScore(k=k)
  • TestPrecisionScore(k=k)
  • TestRecallScore(k=k)
  • TestF1Score(k=k)
  • TestColumnDrift(column_name=target)
  • TestRocAuc()
  • TestLogLoss()
Required:
  • k
Optional:
  • stattest
  • stattest_threshold
  • probas_threshold
BinaryClassificationTestPreset
  • TestColumnDrift(column_name=target)
  • TestPrecisionScore()
  • TestRecallScore()
  • TestF1Score()
  • TestAccuracyScore()
If probabilistic classification, also:
  • TestRocAuc()
Optional:
  • stattest
  • stattest_threshold
  • probas_threshold
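To make the "for all or columns if provided" wording concrete, here is a minimal sketch that expands DataQualityTestPreset into the test calls listed above. It only builds strings and does not call Evidently; `expand_data_quality_preset` is a hypothetical helper introduced for illustration.

```python
from typing import List, Optional


def expand_data_quality_preset(all_columns: List[str],
                               columns: Optional[List[str]] = None) -> List[str]:
    """Return the test calls the preset would contain, as plain strings.

    Hypothetical helper for illustration only; it is not part of Evidently.
    """
    # Per-column tests apply to the user-supplied subset, else to every column.
    selected = columns if columns is not None else all_columns
    tests = [f"TestColumnShareOfMissingValues(column_name={name!r})"
             for name in selected]
    tests += [f"TestMostCommonValueShare(column_name={name!r})"
              for name in selected]
    # Dataset-level tests are added once, regardless of the column selection.
    tests += [
        "TestNumberOfConstantColumns()",
        "TestNumberOfDuplicatedColumns()",
        "TestNumberOfDuplicatedRows()",
        "TestHighlyCorrelatedColumns()",
    ]
    return tests


plan = expand_data_quality_preset(["age", "city", "income"], columns=["age"])
print("\n".join(plan))
```

Passing `columns` narrows only the column-level tests; the four dataset-level tests stay in the plan either way.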

Data Integrity

Defaults for Data Integrity. If there is no reference data or defined conditions, data integrity will be checked against a set of heuristics.
If you pass the reference data, Evidently will automatically derive all relevant statistics (e.g., number of columns, rows, share of missing values etc.) and apply default test conditions. You can also pass custom test conditions.
Defaults for Missing Values. The metrics that calculate the number or share of missing values detect four types of values by default: Pandas nulls (None, NaN, etc.), "" (empty string), the Numpy "-inf" value, and the Numpy "inf" value. You can also pass a custom list of missing values as a parameter and specify whether it should replace the default list. Example:
TestNumberOfMissingValues(missing_values=["", 0, "n/a", -9999, None], replace=True)
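The interaction between the default list and the `replace` flag can be sketched in plain Python. This is an illustration of the behavior described above, not Evidently's implementation: `count_missing` and `DEFAULT_MISSING` are hypothetical names, and the sketch assumes that `None` in a list stands for every null-like value (None and NaN).

```python
import math
from typing import Any, Iterable, List, Optional

# The four default kinds named above: nulls, empty string, +inf, -inf.
# Assumption: None stands for every null-like value (None and NaN).
DEFAULT_MISSING: List[Any] = [None, "", float("inf"), float("-inf")]


def _is_null(value: Any) -> bool:
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_missing(values: Iterable[Any],
                  missing_values: Optional[List[Any]] = None,
                  replace: bool = False) -> int:
    """Count values treated as missing (hypothetical helper)."""
    if missing_values is None:
        active = list(DEFAULT_MISSING)
    elif replace:
        # The custom list replaces the defaults entirely.
        active = list(missing_values)
    else:
        # The custom list extends the defaults.
        active = list(DEFAULT_MISSING) + list(missing_values)

    match_nulls = any(m is None for m in active)
    markers = [m for m in active if m is not None]

    count = 0
    for value in values:
        if _is_null(value):
            if match_nulls:
                count += 1
        # Keep booleans apart from numbers so that False is not counted as 0.
        elif any(value == m and isinstance(value, bool) == isinstance(m, bool)
                 for m in markers):
            count += 1
    return count


column = [1, "", None, float("nan"), float("inf"), "n/a", 0, -9999]
default_count = count_missing(column)
extended_count = count_missing(column, missing_values=["n/a", -9999])
replaced_count = count_missing(
    column, missing_values=["", 0, "n/a", -9999, None], replace=True
)
print(default_count, extended_count, replaced_count)
```

With `replace=True`, the infinite value is no longer counted because the custom list does not include it.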
Test name
Description
Parameters
Default test condition
TestNumberOfRows()
Dataset-level. Tests the number of rows against the reference or a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-10% or >30. With reference: the test fails if the number of rows differs by over 10% from the reference. No reference: the test fails if the number of rows is <= 30.
TestNumberOfColumns()
Dataset-level. Tests the number of columns against the reference or a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects the same or non-zero. With reference: the test fails if the number of columns differs from the reference. No reference: the test fails if the number of columns is 0.
TestNumberOfMissingValues()
Dataset-level. Tests the number of missing values in the dataset against the reference or a defined condition.
Required: N/A Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects up to +10% or 0. With reference: the test fails if the share of missing values is over 10% higher than in reference. No reference: the test fails if the dataset contains missing values.
TestShareOfMissingValues()
Dataset-level. Tests the share of missing values in the dataset against the reference or a defined condition.
Required: N/A Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects up to +10% or 0. With reference: the test fails if the share of missing values is over 10% higher than in reference. No reference: the test fails if the dataset contains missing values.
TestNumberOfColumnsWithMissingValues()
Dataset-level. Tests the number of columns that contain missing values in the dataset against the reference or a defined condition.
Required: N/A Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects <= or 0. With reference: the test fails if the number of columns with missing values is higher than in reference. No reference: the test fails if the dataset contains columns with missing values.
TestShareOfColumnsWithMissingValues()
Dataset-level. Tests the share of columns that contain missing values in the dataset against the reference or a defined condition.
Required: N/A Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects <= or 0. With reference: the test fails if the share of columns with missing values is higher than in reference. No reference: the test fails if the dataset contains columns with missing values.
TestNumberOfRowsWithMissingValues()
Dataset-level. Tests the number of rows that contain missing values against the reference or a defined condition.
Required: N/A Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects up to +10% or 0. With reference: the test fails if the share of rows with missing values is over 10% higher than in reference. No reference: the test fails if the dataset contains rows with missing values.
TestShareOfRowsWithMissingValues()
Dataset-level. Tests the share of rows that contain missing values against the reference or a defined condition.
Required: N/A Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects up to +10% or 0. With reference: the test fails if the share of rows with missing values is over 10% higher than in reference. No reference: the test fails if the dataset contains rows with missing values.
TestNumberOfDifferentMissingValues()
Dataset-level. Tests the number of differently encoded missing values in the dataset against the reference or a defined condition. Detects 4 types of missing values by default and/or values from a user list.
Required: N/A Optional:
  • missing_values: list, replace: bool = True (default = default list)
Test conditions:
  • standard parameters
Expects <= or none. With reference: the test fails if the current dataset has more types of missing values. No reference: the test fails if the current dataset contains missing values.
TestNumberOfConstantColumns()
Dataset-level. Tests the number of columns with all constant values against reference or a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects <= or none. With reference: the test fails if the number of constant columns is higher than in the reference. No reference: the test fails if there is at least one constant column.
TestNumberOfEmptyRows()
Dataset-level. Tests the number of empty rows against reference or a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/- 10% or none. With reference: the test fails if the share of empty rows is over 10% higher or lower than in the reference. No reference: the test fails if there is at least one empty row.
TestNumberOfEmptyColumns()
Dataset-level. Tests the number of empty columns against reference or a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects <= or none. With reference: the test fails if the number of empty columns is higher than in the reference. No reference: the test fails if there is at least one empty column.
TestNumberOfDuplicatedRows()
Dataset-level. Tests the number of duplicate rows against reference or a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/- 10% or none. With reference: the test fails if the share of duplicate rows is over 10% higher or lower than in the reference. No reference: the test fails if there is at least one duplicate row.
TestNumberOfDuplicatedColumns()
Dataset-level. Tests the number of duplicate columns against reference or a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects <= or none. With reference: the test fails if the number of duplicate columns is higher than in the reference. No reference: the test fails if there is at least one duplicate column.
TestColumnsType()
Dataset-level. Tests the types of all columns against the reference.
Required: N/A Optional: columns_type: dict Test conditions: N/A
Expects types to match. With reference: the test fails if at least one column type does not match. No reference: N/A
TestColumnNumberOfMissingValues(column_name='name')
Column-level. Tests the number of missing values in a given column against the reference or a defined condition.
Required:
  • column_name
Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects up to 10% or none. With reference: the test fails if the share of missing values in a column is over 10% higher than in reference. No reference: the test fails if the column contains missing values.
TestColumnShareOfMissingValues(column_name='name')
Column-level. Tests the share of missing values in a given column against the reference or a defined condition.
Required:
  • column_name
Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects up to 10% or none. With reference: the test fails if the share of missing values in a column is over 10% higher than in reference. No reference: the test fails if the column contains missing values.
TestColumnNumberOfDifferentMissingValues(column_name='name')
Column-level. Tests the number of differently encoded missing values in the column against reference or a defined condition. Detects 4 types of missing values by default and/or values from a user list.
Required:
  • column_name
Optional:
  • missing_values = [], replace = True/False (default = default list)
Test conditions:
  • standard parameters
Expects <= or none. With reference: the test fails if the current column has more types of missing values. No reference: the test fails if the column contains missing values.
TestColumnAllConstantValues(column_name='name')
Column-level. Tests if all the values in a given column are constant.
Required:
  • column_name
Optional: N/A Test conditions: N/A
Expects non-constant. The test fails if all values in a given column are constant.
TestColumnAllUniqueValues(column_name='name')
Column-level. Tests if all the values in a given column are unique.
Required:
  • column_name
Optional: N/A Test conditions: N/A
Expects all unique (e.g., IDs). The test fails if at least one value in a given column is not unique.
TestColumnRegExp(column_name='name', reg_exp='^[0..9]')
Column-level. Tests the number of values in a column that do not match a defined regular expression, against reference or a defined condition.
Required:
  • column_name
  • reg_exp
Optional: N/A Test conditions:
  • standard parameters
With reference: the test fails if the share of values that match a regular expression is over 10% higher or lower than in the reference. No reference: the test fails if at least one of the values does not match a regular expression.
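The default conditions in the table above follow a common pattern: a relative comparison when reference data exists, and a fixed heuristic otherwise. The sketch below restates the TestNumberOfRows and TestNumberOfColumns defaults in plain Python. The function names are hypothetical, and treating "differs by over 10%" as a relative difference measured against the reference is an assumption.

```python
from typing import Optional


def number_of_rows_passes(current_rows: int,
                          reference_rows: Optional[int] = None,
                          rel_tolerance: float = 0.10,
                          min_rows: int = 30) -> bool:
    """Default condition for the row count, as described in the table."""
    if reference_rows is not None:
        # With reference: fail if the count differs by over 10%.
        return abs(current_rows - reference_rows) <= rel_tolerance * reference_rows
    # No reference: fail if the number of rows is <= 30.
    return current_rows > min_rows


def number_of_columns_passes(current_columns: int,
                             reference_columns: Optional[int] = None) -> bool:
    """Default condition for the column count, as described in the table."""
    if reference_columns is not None:
        # With reference: the counts must be identical.
        return current_columns == reference_columns
    # No reference: fail only if there are no columns at all.
    return current_columns > 0


print(number_of_rows_passes(1050, reference_rows=1000))
print(number_of_rows_passes(25))
print(number_of_columns_passes(12, reference_columns=13))
```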

Data Quality

Defaults for data quality. If there is no reference data or defined conditions, data quality will be checked against a set of heuristics.
If you pass the reference data, Evidently will automatically derive all relevant statistics (e.g., min value, max value, value range, value list, etc.) and apply default test conditions. You can also pass custom test conditions.
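As an illustration of deriving statistics from the reference, the sketch below takes the min-max range and the value list from reference data and measures how much of the current data falls outside them. The helper names are hypothetical, and treating the range bounds as inclusive is an assumption.

```python
from typing import Hashable, Sequence, Tuple


def reference_range(reference: Sequence[float]) -> Tuple[float, float]:
    """Min-max range as seen in the reference data."""
    return min(reference), max(reference)


def share_out_of_range(current: Sequence[float],
                       left: float, right: float) -> float:
    """Share of current values outside [left, right] (bounds inclusive)."""
    if not current:
        return 0.0
    outside = sum(1 for value in current if value < left or value > right)
    return outside / len(current)


def share_out_of_list(current: Sequence[Hashable],
                      reference: Sequence[Hashable]) -> float:
    """Share of current values that never appear in the reference."""
    if not current:
        return 0.0
    allowed = set(reference)
    return sum(1 for value in current if value not in allowed) / len(current)


left, right = reference_range([1, 5, 9])
range_share = share_out_of_range([0, 2, 5, 10], left, right)
list_share = share_out_of_list(["a", "c", "b", "c"], ["a", "b"])
print(range_share, list_share)
```

A default condition of "all values in range" or "all values in the list" then corresponds to requiring a share of 0.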
Test name
Description
Parameters
Default test conditions
TestConflictTarget()
Dataset-level. Tests if there are conflicts in the target (instances where a different label is assigned for an identical input).
N/A
Expects no conflicts in the target (with or without reference).
TestConflictPrediction()
Dataset-level. Tests if there are conflicts in the prediction (instances where a different prediction is made for an identical input).
N/A
Expects no conflicts in the prediction (with or without reference).
TestTargetPredictionCorrelation()
Dataset-level. Tests the strength of correlation between the target and prediction.
Required: N/A Optional:
  • method (default = pearson, available = pearson, spearman, kendall, cramer_v)
Test conditions:
  • standard parameters
Expects +/- 0.25 in correlation strength, or > 0. With reference: the test fails if there is a 0.25+ change in the correlation strength between target and prediction. No reference: the test fails if the correlation between target and prediction <=0
TestHighlyCorrelatedColumns()
Dataset-level. Tests the strongest correlation between a pair of features, against reference or a defined condition.
Required: N/A Optional:
  • method (default = pearson, available = pearson, spearman, kendall, cramer_v)
Test conditions:
  • standard parameters
Expects +/- 10% in max correlation strength, or < 0.9. With reference: the test fails if there is a 10%+ change in the correlation strength for the most correlated feature pair. No reference: the test fails if there is at least one pair of features with the correlation >= 0.9
TestTargetFeaturesCorrelations()
Dataset-level. Tests if any of the features is highly correlated with the target. Example use: to detect target leak.
Required: N/A Optional:
  • method (default = pearson, available = pearson, spearman, kendall, cramer_v)
Test conditions:
  • standard parameters
Expects +/- 10% in max correlation strength, or < 0.9. With reference: the test fails if there is a 10%+ change in the correlation strength for the feature most correlated with the target. No reference: the test fails if at least one feature is correlated with the target >= 0.9
TestPredictionFeaturesCorrelations()
Dataset-level. Tests if any of the features is highly correlated with the prediction. Example use: to detect when predictions rely on a single feature.
Required: N/A Optional:
  • method (default = pearson, available = pearson, spearman, kendall, cramer_v)
Test conditions:
  • standard parameters
Expects +/- 10% in max correlation strength, or < 0.9. With reference: the test fails if there is a 10%+ change in the correlation strength for the feature most correlated with the prediction. No reference: the test fails if at least one feature is correlated with the prediction >= 0.9
TestCorrelationChanges()
Dataset-level. Tests the number of correlation violations (significant change in the correlation strength between the two features).
Required: N/A Optional:
  • method (default = pearson, available = pearson, spearman, kendall, cramer_v)
  • corr_diff (default = 0.25)
Test conditions:
  • standard parameters
Expects none. With reference: the test fails if at least 1 correlation violation is detected. No reference: N/A
TestColumnValueMin(column_name='num-column')
Column-level. Tests the minimum value of a given numerical column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects not lower. With reference: the test fails if the minimum value is lower than in the reference. No reference: N/A
TestColumnValueMax(column_name='num-column')
Column-level. Tests the maximum value of a given numerical column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects not higher. With reference: the test fails if the maximum value is higher than in the reference. No reference: N/A
TestColumnValueMean(column_name='num-column')
Column-level. Tests the mean value of a given numerical column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects +/-10%. With reference: the test fails if the mean value is different by more than 10%. No reference: N/A
TestColumnValueMedian(column_name='num-column')
Column-level. Tests the median value of a given numerical column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects +/-10%. With reference: the test fails if the median value is different by more than 10%. No reference: N/A
TestColumnValueStd(column_name='num-column')
Column-level. Tests the standard deviation of a given numerical column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects +/-10%. With reference: the test fails if the standard deviation is different by more than 10%. No reference: N/A
TestNumberOfUniqueValues(column_name='name')
Column-level. Tests the number of unique values in a given column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects +/-10%. With reference: the test fails if the number of unique values is different by more than 10%. No reference: N/A
TestUniqueValuesShare(column_name='name')
Column-level. Tests the share of unique values in a given column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects +/-10%. With reference: the test fails if the share of unique values is different by more than 10%. No reference: N/A
TestMostCommonValueShare(column_name='name')
Column-level. Tests the share of the most common value in a given column against reference or a defined condition.
Required:
  • column_name
Optional: N/A Test conditions:
  • standard parameters
Expects +/-10%. With reference: the test fails if the share of the most common value is different by more than 10% from the reference. No reference: the test fails if the share of the most common value is >= 80%.
TestMeanInNSigmas(column_name='num-column')
Column-level. Tests if the mean value in a given numerical column is within the expected range, defined in standard deviations. This test requires reference.
Required:
  • column_name
Optional:
  • n_sigmas
Expects +/- 2 std dev. With reference: the test fails if the current mean value is out of the +/- 2 std dev interval from the reference mean value. No reference: N/A
TestValueRange(column_name='num_column')
Column-level. Tests if a numerical column contains values out of the min-max range.
Required:
  • column_name
Optional:
  • left
  • right
Test conditions: N/A
Expects all values to be in range. With reference: the test fails if the column contains values out of the min-max range as seen in the reference. No reference: N/A
TestShareOfOutRangeValues(column_name='num_column')
Column-level. Tests the share of values out of the min-max range against reference or a defined condition.
Required:
  • column_name
Optional:
  • left
  • right
Test conditions:
  • standard parameters
Expects all values to be in range.
TestNumberOfOutRangeValues(column_name='num_column')
Column-level. Tests the number of values out of the min-max range against reference or a defined condition.
Required:
  • column_name
Optional:
  • left
  • right
Test conditions:
  • standard parameters
Expects all values to be in range. With reference: the test fails if at least 1 value is out of range (as seen in reference). No reference: N/A
TestValueList(column_name='cat_column')
Column-level. Tests if a categorical column contains values out of the list.
Required:
  • column_name
Optional:
  • values: List[str]
Test conditions: N/A
Expects all values to be in the list. With reference: the test fails if the column contains values out of the list (as seen in reference). No reference: N/A
TestNumberOfOutListValues(column_name='cat_column')
Column-level. Tests the number of values in a given column that are out of list, against reference or a defined condition.
Required:
  • column_name
Optional:
  • values: List[str]
Test conditions:
  • standard parameters
Expects all values to be in the list. With reference: the test fails if the column contains values out of the list (as seen in reference). No reference: N/A
TestShareOfOutListValues(column_name='cat_column')
Column-level. Tests the share of values in a given column that are out of list against reference or a defined condition.
Required:
  • column_name
Optional:
  • values: List[str]
Test conditions:
  • standard parameters
Expects all values to be in the list. With reference: the test fails if the column contains values out of the list (as seen in reference). No reference: N/A
TestColumnQuantile(column_name='num_column', quantile=0.25)
Column-level. Computes a quantile value and compares it to the reference or against a defined condition.
Required:
  • column_name
  • quantile
Optional: N/A Test conditions:
  • standard parameters
Expects +/-10%. With reference: the test fails if the quantile value is over 10% higher or lower. No reference: N/A
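The TestMeanInNSigmas default from the table above can be written out as a short check. This is a sketch under stated assumptions rather than the library's code: `mean_in_n_sigmas` is a hypothetical name, and using the sample standard deviation of the reference is an assumption.

```python
import statistics
from typing import Sequence


def mean_in_n_sigmas(current: Sequence[float],
                     reference: Sequence[float],
                     n_sigmas: float = 2.0) -> bool:
    """Pass if the current mean lies within n_sigmas reference std devs
    of the reference mean."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)  # sample std dev (assumption)
    cur_mean = statistics.mean(current)
    return abs(cur_mean - ref_mean) <= n_sigmas * ref_std


reference = [10, 12, 14, 16, 18]
print(mean_in_n_sigmas([13, 15, 17], reference))
print(mean_in_n_sigmas([30, 32], reference))
```

Because the interval is built from the reference mean and standard deviation, the check cannot run without reference data, which matches the "No reference: N/A" entry.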

Data Drift

Defaults for Data Drift. By default, all data drift tests use the Evidently drift detection logic that selects a different statistical test or metric based on feature type and volume. You always need a reference dataset.
To modify the logic or select a different test, you should set data drift parameters.
Test name
Description
Parameters
Default test conditions
TestNumberOfDriftedColumns()
Dataset-level. Compares the distribution of each column in the current dataset to the reference and tests the number of drifting features against a defined condition.
Required: N/A Optional:
  • columns
  • stattest(default=automated selection)
  • cat_stattest
  • num_stattest
  • per_column_stattest
  • stattest_threshold(default=test default)
  • cat_stattest_threshold
  • num_stattest_threshold
  • per_column_stattest_threshold
Test conditions:
  • standard parameters
Expects <= 1/3 of features to drift. With reference: the test fails if more than 1/3 of features drifted. No reference: N/A
TestShareOfDriftedColumns()
Dataset-level. Compares the distribution of each column in the current dataset to the reference and tests the share of drifting features against a defined condition.
Required: N/A Optional:
  • columns
  • stattest(default=automated selection)
  • cat_stattest
  • num_stattest
  • per_column_stattest
  • stattest_threshold(default=test default)
  • cat_stattest_threshold
  • num_stattest_threshold
  • per_column_stattest_threshold
Test conditions:
  • standard parameters
Expects <= 1/3 of features to drift. With reference: the test fails if more than 1/3 of features drifted. No reference: N/A
TestColumnDrift(column_name='name')
Column-level. Tests if there is a distribution shift in a given column compared to the reference.
Required:
  • column_name
Optional:
  • stattest(default=automated selection)
  • stattest_threshold(default=test default)
Expects no drift. With reference: the test fails if the distribution drift is detected in a given column. No reference: N/A
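The dataset-level condition above reduces to counting per-column drift results. The sketch below assumes the per-column drift flags have already been computed by whichever statistical test applies; `drifted_columns_summary` is a hypothetical helper, and the 1/3 limit is taken from the table.

```python
from typing import Dict, Tuple


def drifted_columns_summary(drift_by_column: Dict[str, bool],
                            max_share: float = 1 / 3) -> Tuple[int, float, bool]:
    """Return (number drifted, share drifted, passes) for per-column flags."""
    if not drift_by_column:
        raise ValueError("at least one column is required")
    n_drifted = sum(1 for drifted in drift_by_column.values() if drifted)
    share = n_drifted / len(drift_by_column)
    # The test fails only if more than max_share of the columns drifted.
    return n_drifted, share, share <= max_share


flags = {"age": True, "income": False, "city": False, "tenure": False}
n_drifted, share, passes = drifted_columns_summary(flags)
print(n_drifted, share, passes)
```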

Regression

Defaults for Regression tests: if there is no reference data or defined conditions, Evidently will compare the model performance to a dummy model that predicts the optimal constant (varies by the metric).
You can also pass the reference dataset and run the test with default conditions, or define custom test conditions.
Test name
Description
Parameters
Default test conditions
TestValueMAE()
Dataset-level. Computes the Mean Absolute Error (MAE) and compares it to the reference or against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-10% or better than a dummy model. With reference: if MAE is higher or lower by over 10%, the test fails. No reference: the test fails if the MAE value is higher than the MAE of the dummy model that predicts the optimal constant (median of the target value).
TestValueRMSE()
Dataset-level. Computes the Root Mean Square Error (RMSE) and compares it to the reference or against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-10% or better than a dummy model. With reference: if RMSE is higher or lower by over 10%, the test fails. No reference: the test fails if the RMSE value is higher than the RMSE of the dummy model that predicts the optimal constant (mean of the target value).
TestValueMeanError()
Dataset-level. Computes the Mean Error (ME) and tests if it is near zero or compares it against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects the Mean Error to be near zero. With/without reference: the test fails if the Mean Error is skewed and the condition is violated. Condition: eq = approx(absolute=0.1*error_std) error_std = (curr_true - curr_preds).std()
TestValueMAPE()
Dataset-level. Computes the Mean Absolute Percentage Error (MAPE) and compares it to the reference or against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-10% or better than a dummy model. With reference: if MAPE is higher or lower by over 10%, the test fails. No reference: the test fails if the MAPE value is higher than the MAPE of the dummy model that predicts the optimal constant (weighted median of the target value).
TestValueAbsMaxError()
Dataset-level. Computes the absolute maximum error and compares it to the reference or against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-10% or better than a dummy model. With reference: if the absolute maximum error is higher or lower by over 10%, the test fails. No reference: the test fails if the absolute maximum error is higher than the absolute maximum error of the dummy model that predicts the optimal constant (median of the target value).
TestValueR2Score()
Dataset-level. Computes the R2 Score (coefficient of determination) and compares it to the reference or against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-10% or > 0. With reference: if R2 is higher or lower by over 10%, the test fails. No reference: the test fails if the R2 value is <= 0.
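The no-reference defaults for regression compare the model against a constant predictor. The sketch below reproduces that comparison for MAE (median baseline) and RMSE (mean baseline), and writes out the Mean Error condition `approx(absolute=0.1*error_std)`. All function names are hypothetical, and using the sample standard deviation for `error_std` is an assumption.

```python
import math
import statistics
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return statistics.fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return math.sqrt(statistics.fmean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def beats_dummy_mae(y_true: Sequence[float], y_pred: Sequence[float]) -> bool:
    # The optimal constant for MAE is the median of the target.
    dummy = [statistics.median(y_true)] * len(y_true)
    return mae(y_true, y_pred) <= mae(y_true, dummy)


def beats_dummy_rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> bool:
    # The optimal constant for RMSE is the mean of the target.
    dummy = [statistics.fmean(y_true)] * len(y_true)
    return rmse(y_true, y_pred) <= rmse(y_true, dummy)


def mean_error_near_zero(y_true: Sequence[float], y_pred: Sequence[float]) -> bool:
    # Condition from the table: |ME| <= 0.1 * std of the errors.
    errors = [t - p for t, p in zip(y_true, y_pred)]
    error_std = statistics.stdev(errors)  # sample std dev (assumption)
    return abs(statistics.fmean(errors)) <= 0.1 * error_std


y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 39]
print(mae(y_true, y_pred), beats_dummy_mae(y_true, y_pred))
print(beats_dummy_rmse(y_true, y_pred), mean_error_near_zero(y_true, y_pred))
```

In this example the model beats both dummy baselines, yet the Mean Error check fails because the errors lean negative relative to their spread.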

Classification

You can apply the tests for non-probabilistic, probabilistic classification, and ranking. The underlying metrics will be calculated slightly differently depending on the provided inputs: only labels, probabilities, decision threshold, and/or K (to compute, e.g., precision@K).
Defaults for Classification tests. If there is no reference data or defined conditions, Evidently will compare the model performance to a dummy model. It is based on a set of heuristics to verify that the quality is better than random.
You can also pass the reference dataset and run the test with default conditions, or define custom test conditions.
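To show how the same metric can differ depending on the inputs, the sketch below turns predicted probabilities into binary labels in two ways, by a decision threshold and by top-K, and computes precision for each. The helper names are hypothetical, and treating a probability equal to the threshold as positive is an assumption.

```python
from typing import List, Sequence


def labels_from_threshold(probas: Sequence[float],
                          threshold: float = 0.5) -> List[int]:
    """Label as positive every object at or above the threshold (assumption)."""
    return [1 if p >= threshold else 0 for p in probas]


def labels_from_top_k(probas: Sequence[float], k: int) -> List[int]:
    """Label as positive the k objects with the highest probability."""
    ranked = sorted(range(len(probas)), key=lambda i: probas[i], reverse=True)
    top = set(ranked[:k])
    return [1 if i in top else 0 for i in range(len(probas))]


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


probas = [0.9, 0.8, 0.6, 0.4, 0.2]
y_true = [1, 0, 1, 1, 0]

by_threshold = precision(y_true, labels_from_threshold(probas, 0.5))
at_k2 = precision(y_true, labels_from_top_k(probas, k=2))
print(by_threshold, at_k2)
```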
Test name
Description
Parameters
Default test conditions
TestAccuracyScore()
Dataset-level. Computes the Accuracy and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: if the Accuracy is over 20% higher or lower, the test fails. No reference: if the Accuracy is lower than the Accuracy of the dummy model, the test fails.
TestPrecisionScore()
Dataset-level. Computes the Precision and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: if the Precision is over 20% higher or lower, the test fails. No reference: if the Precision is lower than the Precision of the dummy model, the test fails.
TestRecallScore()
Dataset-level. Computes the Recall and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: if the Recall is over 20% higher or lower, the test fails. No reference: if the Recall is lower than the Recall of the dummy model, the test fails.
TestF1Score()
Dataset-level. Computes the F1 score and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: if the F1 is over 20% higher or lower, the test fails. No reference: if the F1 is lower than the F1 of the dummy model, the test fails.
TestPrecisionByClass(label='classN')
Dataset-level. Computes the Precision for the specified class and compares it to the reference or against a defined condition.
Required:
  • label
Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k (default = None)
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: if the Precision is over 20% higher or lower, the test fails. No reference: if the Precision is lower than the Precision of the dummy model, the test fails.
TestRecallByClass(label='classN')
Dataset-level. Computes the Recall for the specified class and compares it to the reference or against a defined condition.
Required:
  • label
Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k (default = None)
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: if the Recall is over 20% higher or lower, the test fails. No reference: if the Recall is lower than the Recall of the dummy model, the test fails.
TestF1ByClass(label='classN')
Dataset-level. Computes the F1 for the specified class and compares it to the reference or against a defined condition.
Required:
  • label
Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k (default = None)
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: the test fails if the F1 is over 20% higher or lower. No reference: the test fails if the F1 is lower than the F1 of the dummy model.
TestTPR()
Dataset-level. Computes the True Positive Rate and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k (default = None)
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: the test fails if the TPR is over 20% higher or lower. No reference: the test fails if the TPR is lower than the TPR of the dummy model.
TestTNR()
Dataset-level. Computes the True Negative Rate and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k (default = None)
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: the test fails if the TNR is over 20% higher or lower. No reference: the test fails if the TNR is lower than the TNR of the dummy model.
TestFPR()
Dataset-level. Computes the False Positive Rate and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k (default = None)
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: the test fails if the FPR is over 20% higher or lower. No reference: the test fails if the FPR is higher than the FPR of the dummy model.
TestFNR()
Dataset-level. Computes the False Negative Rate and compares it to the reference or against a defined condition.
Required: N/A Optional:
  • probas_threshold(default for classification = None; default for probabilistic classification = 0.5)
  • k (default = None)
Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: the test fails if the FNR is over 20% higher or lower. No reference: the test fails if the FNR is higher than the FNR of the dummy model.
TestRocAuc()
Dataset-level. Applies to probabilistic classification. Computes the ROC AUC and compares it to the reference or against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-20% or > 0.5. With reference: the test fails if the ROC AUC is over 20% higher or lower than in the reference. No reference: the test fails if ROC AUC is <= 0.5.
TestLogLoss()
Dataset-level. Applies to probabilistic classification. Computes the LogLoss and compares it to the reference or against a defined condition.
Required: N/A Optional: N/A Test conditions:
  • standard parameters
Expects +/-20% or better than a dummy model. With reference: the test fails if the LogLoss is over 20% higher or lower than in the reference. No reference: the test fails if LogLoss is higher than the LogLoss of the dummy model (equals 0.5 for a constant model).
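Most rows in the table above share one decision rule: a +/-20% comparison when a reference exists, and a comparison against a dummy model otherwise. A minimal sketch of that rule follows. `metric_test_passes` is a hypothetical helper, the dummy value is supplied by the caller instead of being computed, and reading "20%" as a relative difference from the reference value is an assumption.

```python
from typing import Optional


def metric_test_passes(current: float,
                       reference: Optional[float] = None,
                       dummy: Optional[float] = None,
                       higher_is_better: bool = True,
                       rel_tolerance: float = 0.20) -> bool:
    """Default pass/fail rule for a classification quality metric."""
    if reference is not None:
        # With reference: fail if the metric moved by over 20% either way.
        return abs(current - reference) <= rel_tolerance * abs(reference)
    if dummy is None:
        raise ValueError("either reference or dummy must be provided")
    # No reference: the model must not be worse than the dummy model.
    # For FPR, FNR, and LogLoss, lower values are better.
    return current >= dummy if higher_is_better else current <= dummy


print(metric_test_passes(0.85, reference=0.80))
print(metric_test_passes(0.70, dummy=0.60))
print(metric_test_passes(0.30, dummy=0.20, higher_is_better=False))
```

Note that with a reference the rule is symmetric: a large improvement also fails the test, which can flag issues such as target leakage.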