For an intro, read Core Concepts and check the quickstarts for LLMs or ML. For a reference code example, see the Metric cookbook.
How to read the tables
- Metric: the name of the Metric or Preset you can pass to a Report.
- Description: what it does. Complex Metrics link to explainer pages.
- Parameters: available options. You can also add conditional tests to any Metric with standard operators like eq (equal), gt (greater than), etc., as shown in the sketch below. How Tests work.
- Test defaults: conditions that apply when you invoke Tests but do not set a pass/fail condition yourself.
  - With reference: if you provide a reference dataset during the Report run, the conditions are set relative to the reference.
  - No reference: if you do not provide a reference, Tests use fixed heuristics (like "expect no missing values").
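For example, here is a minimal sketch of a Report with a single Metric and a conditional test. It assumes the `evidently` Python API shown in the quickstarts (`Report`, `Dataset`, `DataDefinition`, and the `tests` helpers); exact imports and signatures may vary by version.

```python
import pandas as pd

from evidently import Dataset, DataDefinition, Report
from evidently.metrics import MissingValueCount
from evidently.tests import eq

# Toy data: one numerical column with a single missing value.
df = pd.DataFrame({"age": [21.0, 35.0, None, 42.0]})
eval_data = Dataset.from_pandas(
    df, data_definition=DataDefinition(numerical_columns=["age"])
)

# tests=[eq(0)] attaches a pass/fail condition: this check fails
# because one value is missing.
report = Report([MissingValueCount(column="age", tests=[eq(0)])])
my_eval = report.run(eval_data)

print(my_eval.json())  # metric value plus the test verdict
```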
Text Evals
Summarizes the results of text or LLM evals. To score individual inputs, first use descriptors. Data definition: you may need to map text columns.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| TextEvals() | Summarizes the results of computed descriptors. | Optional | As in Metrics included in ValueStats. |
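A sketch of scoring texts with descriptors and summarizing them in one Report. `TextLength` and `Sentiment` are example descriptors, and passing `descriptors=` to `Dataset.from_pandas` follows the quickstart pattern; check the descriptors reference for the full list.

```python
import pandas as pd

from evidently import Dataset, DataDefinition, Report
from evidently.descriptors import Sentiment, TextLength
from evidently.presets import TextEvals

df = pd.DataFrame({"response": ["Thanks, this helped a lot!", "The answer is wrong."]})

# Score each row with descriptors, then summarize them with TextEvals.
eval_data = Dataset.from_pandas(
    df,
    data_definition=DataDefinition(text_columns=["response"]),
    descriptors=[TextLength("response"), Sentiment("response")],
)

report = Report([TextEvals()])
my_eval = report.run(eval_data)
```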
Columns
Use to aggregate descriptor results or check data quality at the column level. Data definition: you may need to map column types.
Value stats
Descriptive statistics.

| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| ValueStats() | Overview of descriptive statistics for a column. | Required: column | |
| MinValue() | Minimum value in the column. | Required: column | |
| StdValue() | Standard deviation of the column values. | Required: column | |
| MeanValue() | Mean of the column values. | Required: column | |
| MaxValue() | Maximum value in the column. | Required: column | |
| MedianValue() | Median of the column values. | Required: column | |
| QuantileValue() | A specified quantile of the column values. | Required: column, quantile | |
| CategoryCount() Example: `CategoryCount(column="city", category="NY")` | Count of rows with the given category. | Required: column, category | |
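A sketch combining several value stats in one Report. The `quantile=` parameter name for QuantileValue is an assumption; the other calls follow the examples in the table above.

```python
import pandas as pd

from evidently import Dataset, DataDefinition, Report
from evidently.metrics import CategoryCount, MeanValue, QuantileValue, ValueStats
from evidently.tests import gt

df = pd.DataFrame({"age": [21, 35, 29, 42], "city": ["NY", "London", "NY", "Paris"]})
eval_data = Dataset.from_pandas(
    df,
    data_definition=DataDefinition(
        numerical_columns=["age"], categorical_columns=["city"]
    ),
)

report = Report([
    ValueStats(column="age"),                    # full per-column summary
    MeanValue(column="age"),
    QuantileValue(column="age", quantile=0.75),  # `quantile=` is assumed; check the signature
    CategoryCount(column="city", category="NY", tests=[gt(0)]),  # with a test condition
])
my_eval = report.run(eval_data)
```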
Column data quality
Column-level data quality metrics. Data definition: you may need to map column types.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| MissingValueCount() | Count of missing values in the column. | Required: column | |
| InRangeValueCount() Example: `InRangeValueCount(column="age", left=1, right=18)` | Count of values inside the given range. | Required: column, left, right | |
| OutRangeValueCount() | Count of values outside the given range. | Required: column, left, right | |
| InListValueCount() | Count of values that are in the given list. | Required: column, values | |
| OutListValueCount() Example: `OutListValueCount(column="city", values=["Lon", "NY"])` | Count of values that are not in the given list. | Required: column, values | |
| UniqueValueCount() | Count of unique values in the column. | Required: column | |
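A sketch of a column-level data quality check, reusing the example calls from the table above; the toy data and the `tests=` condition are illustrative.

```python
import pandas as pd

from evidently import Dataset, DataDefinition, Report
from evidently.metrics import (
    InRangeValueCount,
    MissingValueCount,
    OutListValueCount,
    UniqueValueCount,
)
from evidently.tests import eq

df = pd.DataFrame({"age": [7, 12, 16, None], "city": ["NY", "Lon", "Berlin", "NY"]})
eval_data = Dataset.from_pandas(
    df,
    data_definition=DataDefinition(
        numerical_columns=["age"], categorical_columns=["city"]
    ),
)

report = Report([
    MissingValueCount(column="age", tests=[eq(0)]),  # fails: one value is missing
    InRangeValueCount(column="age", left=1, right=18),
    OutListValueCount(column="city", values=["Lon", "NY"]),
    UniqueValueCount(column="city"),
])
my_eval = report.run(eval_data)
```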
Dataset
Use for exploratory data analysis and data quality checks. Data definition: you may need to map column types, ID and timestamp.
Dataset stats
Descriptive statistics.

| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| DataSummaryPreset() | Summarizes the dataset and each column. | Optional | As in individual Metrics. |
| DatasetStats() | Descriptive statistics for the whole dataset. | None | |
| RowCount() | Number of rows in the dataset. | Optional | |
| ColumnCount() | Number of columns in the dataset. | Optional | |
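A sketch of a dataset-level summary with a row-count guard; `gte` is assumed to be available in `evidently.tests` alongside the `eq` and `gt` operators named above.

```python
import pandas as pd

from evidently import Dataset, DataDefinition, Report
from evidently.metrics import ColumnCount, RowCount
from evidently.presets import DataSummaryPreset
from evidently.tests import gte

df = pd.DataFrame({"age": [21, 35, 29], "city": ["NY", "London", "NY"]})
eval_data = Dataset.from_pandas(df, data_definition=DataDefinition())

report = Report([
    DataSummaryPreset(),         # dataset stats plus per-column summaries
    RowCount(tests=[gte(100)]),  # fail the run if there are fewer than 100 rows
    ColumnCount(),
])
my_eval = report.run(eval_data)
```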
Dataset data quality
Dataset-level data quality metrics. Data definition: you may need to map column types, ID and timestamp.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| ConstantColumnsCount() | Count of columns with a constant value. | Optional | |
| EmptyRowsCount() | Count of empty rows. | Optional | |
| EmptyColumnsCount() | Count of empty columns. | Optional | |
| DuplicatedRowCount() | Count of duplicated rows. | Optional | |
| DuplicatedColumnsCount() | Count of duplicated columns. | Optional | |
| DatasetMissingValueCount() | Count of missing values across the dataset. | Required | |
| AlmostConstantColumnsCount() | Count of columns that are almost constant. | Optional | |
| ColumnsWithMissingValuesCount() | Count of columns that contain missing values. | Optional | |
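A sketch of a typical dataset-level quality gate built from these Metrics; run it against a `Dataset` exactly as in the earlier sketches.

```python
from evidently import Report
from evidently.metrics import (
    ColumnsWithMissingValuesCount,
    ConstantColumnsCount,
    DuplicatedRowCount,
    EmptyRowsCount,
)
from evidently.tests import eq

# Common gate: no duplicate rows, no empty rows, plus informational counts.
report = Report([
    DuplicatedRowCount(tests=[eq(0)]),
    EmptyRowsCount(tests=[eq(0)]),
    ConstantColumnsCount(),
    ColumnsWithMissingValuesCount(),
])
# my_eval = report.run(eval_data)  # eval_data: a Dataset, as in the sketches above
```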
Data Drift
Use to detect distribution drift for text and tabular data, or over computed text descriptors. Supports 20+ drift detection methods, listed separately for text and tabular data. Data definition: you may need to map column types, ID and timestamp.
Metric explainers: understand how data drift works.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| DataDriftPreset() | Runs drift detection for all or selected columns. | Optional | |
| DriftedColumnsCount() | Count and share of drifted columns. | Optional | |
| ValueDrift() | Drift score for a single column. | Required: column | |
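A sketch of a drift check. Drift Metrics require both a current and a reference dataset; the positional `report.run(current, reference)` ordering follows the quickstarts, so verify it against your version.

```python
import pandas as pd

from evidently import Dataset, DataDefinition, Report
from evidently.metrics import DriftedColumnsCount, ValueDrift
from evidently.presets import DataDriftPreset

definition = DataDefinition(numerical_columns=["age"], categorical_columns=["city"])
reference = Dataset.from_pandas(
    pd.DataFrame({"age": [20, 25, 30, 35], "city": ["NY", "NY", "London", "NY"]}),
    data_definition=definition,
)
current = Dataset.from_pandas(
    pd.DataFrame({"age": [55, 60, 62, 70], "city": ["Paris", "Paris", "NY", "Paris"]}),
    data_definition=definition,
)

report = Report([
    DataDriftPreset(),         # drift checks for all columns
    DriftedColumnsCount(),     # count and share of drifted columns
    ValueDrift(column="age"),  # drift for a single column
])
my_eval = report.run(current, reference)  # drift needs both datasets
```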
Classification
Use to evaluate quality on a classification task (probabilistic or non-probabilistic, binary or multi-class). Data definition: you may need to map the prediction and target columns and the classification type.
General
Use for binary classification and for aggregated results in multi-class tasks.

| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| ClassificationPreset() | Classification quality metrics in a single Report. | Optional: probas_threshold | As in individual Metrics. |
| ClassificationQuality() | Summary of classification quality metrics. | Optional: probas_threshold | As in individual Metrics. |
| Accuracy() | Share of correct predictions. | Optional | |
| Precision() | Precision: share of true positives among predicted positives. | Required | |
| Recall() | Recall: share of true positives among actual positives. | Required | |
| F1Score() | F1 score: harmonic mean of precision and recall. | Required | |
| TPR() | True positive rate. | Required | |
| TNR() | True negative rate. | Required | |
| FPR() | False positive rate. | Required | |
| FNR() | False negative rate. | Required | |
| LogLoss() | Logarithmic loss. | Required | |
| RocAUC() | ROC AUC. | Required | |
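A sketch of a binary classification evaluation. The `BinaryClassification` mapping inside `DataDefinition` follows the Data definition docs; the field names here are assumptions, so adjust them to your version.

```python
import pandas as pd

from evidently import BinaryClassification, Dataset, DataDefinition, Report
from evidently.presets import ClassificationPreset

df = pd.DataFrame({
    "target":     [1, 0, 1, 1, 0, 1],
    "prediction": [1, 0, 0, 1, 0, 1],
})

# Map which columns hold the labels; field names are assumed from
# the Data definition docs.
definition = DataDefinition(
    classification=[BinaryClassification(target="target", prediction_labels="prediction")]
)
eval_data = Dataset.from_pandas(df, data_definition=definition)

report = Report([ClassificationPreset()])
my_eval = report.run(eval_data)
```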
Dummy model quality
Use these Metrics to get the quality of a dummy model created on the same data (based on heuristics). Compare your model's quality against it to verify that your model performs better than random. These Metrics serve as a baseline in automated testing.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| ClassificationDummyQuality() | Quality metrics of a dummy classification model. | N/A | N/A |
| DummyPrecision() | Precision of a dummy model. | N/A | N/A |
| DummyRecall() | Recall of a dummy model. | N/A | N/A |
| DummyF1() | F1 score of a dummy model. | N/A | N/A |
By label
Use when you have multiple classes and want to evaluate quality for each class separately.

| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| ClassificationQualityByLabel() | Classification quality metrics for each class. | None | As in individual Metrics. |
| PrecisionByLabel() | Precision for each class. | Optional | |
| F1ByLabel() | F1 score for each class. | Optional | |
| RecallByLabel() | Recall for each class. | Optional | |
| RocAUCByLabel() | ROC AUC for each class. | Optional | |
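A sketch of per-class evaluation for a multi-class task; it assumes the Dataset carries a classification mapping in its DataDefinition, as in the previous sketch.

```python
from evidently import Report
from evidently.metrics import F1ByLabel, PrecisionByLabel, RecallByLabel

# Per-class quality; the Dataset must have a classification mapping.
report = Report([PrecisionByLabel(), RecallByLabel(), F1ByLabel()])
# my_eval = report.run(eval_data)  # eval_data: a Dataset with a classification mapping
```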
Regression
Use to evaluate the quality of a regression model. Data definition: you may need to map the prediction and target columns.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| RegressionPreset() | Regression quality metrics in a single Report. | None | As in individual Metrics. |
| RegressionQuality() | Summary of regression quality metrics. | None | As in individual Metrics. |
| MeanError() | Mean error: average of predicted minus actual values. | Required | |
| MAE() | Mean absolute error. | Required | |
| RMSE() | Root mean squared error. | Optional | |
| MAPE() | Mean absolute percentage error. | Required | |
| R2Score() | R² score (coefficient of determination). | Optional | |
| AbsMaxError() | Maximum absolute error. | Optional | |
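A sketch of a regression evaluation. The `Regression` mapping inside `DataDefinition` is an assumption based on the Data definition docs; adjust the import and field names to your version.

```python
import pandas as pd

from evidently import Dataset, DataDefinition, Regression, Report
from evidently.presets import RegressionPreset

df = pd.DataFrame({
    "target":     [3.1, 2.0, 5.5, 4.2],
    "prediction": [2.9, 2.4, 5.0, 4.6],
})

# Regression(...) maps the target and prediction columns; this mapping
# class and its fields are assumed -- see the Data definition docs.
definition = DataDefinition(
    numerical_columns=["target", "prediction"],
    regression=[Regression(target="target", prediction="prediction")],
)
eval_data = Dataset.from_pandas(df, data_definition=definition)

report = Report([RegressionPreset()])
my_eval = report.run(eval_data)
```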
Dummy model quality
Use these Metrics to get the baseline quality for regression: they use optimal constants (which vary by Metric). These Metrics serve as a baseline in automated testing.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| RegressionDummyQuality() | Quality metrics of a dummy regression model. | N/A | N/A |
| DummyMeanError() | Mean error of a dummy model. | N/A | N/A |
| DummyMAE() | MAE of a dummy model. | N/A | N/A |
| DummyMAPE() | MAPE of a dummy model. | N/A | N/A |
| DummyRMSE() | RMSE of a dummy model. | N/A | N/A |
| DummyR2() | R² of a dummy model. | N/A | N/A |
Ranking
Use to evaluate ranking, search / retrieval, or recommendations. Data definition: you may need to map the prediction and target columns and the ranking type.
Metric explainers: check the ranking metric explainers.
| Metric | Description | Parameters | Test Defaults |
|---|---|---|---|
| RecallTopK() | Recall at top-k. | Required | |
| FBetaTopK() | F-beta score at top-k. | Required | |
| PrecisionTopK() | Precision at top-k. | Required | |
| MAP() | Mean average precision (MAP). | Required | |
| NDCG() | Normalized discounted cumulative gain (NDCG). | Required | |
| MRR() | Mean reciprocal rank (MRR). | Required | |
| HitRate() | Hit rate at top-k. | Required | |
| ScoreDistribution() | Distribution of predicted scores. | Required | |
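A sketch of a top-k ranking evaluation. Here `k` is assumed to be the required parameter of these Metrics, and the Dataset must carry a ranking mapping in its DataDefinition per the Data definition docs.

```python
from evidently import Report
from evidently.metrics import MAP, MRR, NDCG, PrecisionTopK, RecallTopK

# Top-k ranking quality; `k` is an assumed parameter name.
report = Report([
    RecallTopK(k=5),
    PrecisionTopK(k=5),
    NDCG(k=5),
    MAP(k=5),
    MRR(k=5),
])
# my_eval = report.run(current_data, reference_data)  # Datasets with a ranking mapping
```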