All Metrics
Reference page for all dataset-level evals.
For an introduction, read Core Concepts and check the quickstarts for LLMs or ML.
Text Evals
Summarizes results of text or LLM evals. To score individual inputs, first use descriptors.
Data definition. You may need to map text columns.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
TextEvals() | | Optional: | As in Metrics included in ValueStats |
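To make the flow concrete, here is a plain-Python sketch (not the library API) of what TextEvals does conceptually: compute a row-level descriptor first, then summarize it at the dataset level. The texts and the length descriptor are invented for illustration.

```python
from statistics import mean

# Step 1: a row-level descriptor (here, text length) scored per input.
texts = ["short answer", "a much longer generated answer", "ok"]
lengths = [len(t) for t in texts]

# Step 2: a dataset-level summary over the descriptor column,
# which is what TextEvals() reports for each descriptor.
summary = {"min": min(lengths), "max": max(lengths), "mean": mean(lengths)}
```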
Columns
Use to aggregate descriptor results or check data quality at the column level.
You may need to map column types using Data definition.
Value stats
Descriptive statistics.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
ValueStats() | | Required: | |
MinValue() | | Required: | |
StdValue() | | Required: | |
MeanValue() | | Required: | |
MaxValue() | | Required: | |
MedianValue() | | Required: | |
QuantileValue() | | Required: | |
CategoryCount() Example: CategoryCount(column="city", category="NY") | | Required: | |
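A plain-Python sketch of what these value-stats metrics report for a single column; the names mirror the metrics above but this is not the library API, and the column values are invented.

```python
from statistics import mean, median, stdev, quantiles

ages = [23, 35, 41, 35, 52, 29]
cities = ["NY", "London", "NY", "Paris", "NY", "London"]

stats = {
    "MinValue": min(ages),
    "MaxValue": max(ages),
    "MeanValue": mean(ages),
    "MedianValue": median(ages),
    "StdValue": stdev(ages),
    # QuantileValue, e.g. the 0.75 quantile (third quartile):
    "QuantileValue": quantiles(ages, n=4)[2],
    # CategoryCount(column="city", category="NY"):
    "CategoryCount": cities.count("NY"),
}
```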
Column data quality
Column-level data quality metrics.
Data definition. You may need to map column types.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
MissingValueCount() | | Required: | |
InRangeValueCount() Example: InRangeValueCount(column="age", left=1, right=18) | | Required: | |
OutRangeValueCount() | | Required: | |
InListValueCount() | | Required: | |
OutListValueCount() Example: OutListValueCount(column="city", values=["Lon", "NY"]) | | Required: | |
UniqueValueCount() | | Required: | |
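A plain-Python sketch of the column-level data quality counts above; the columns and thresholds are invented, and this is an illustration of the computed quantities, not the library API.

```python
ages = [12, None, 25, 17, None, 40]
cities = ["NY", "Lon", "NY", "Berlin", "NY", "Paris"]

missing_value_count = sum(1 for v in ages if v is None)       # MissingValueCount
present = [v for v in ages if v is not None]
in_range = sum(1 for v in present if 1 <= v <= 18)            # InRangeValueCount(left=1, right=18)
out_range = len(present) - in_range                           # OutRangeValueCount
out_list = sum(1 for c in cities if c not in ["Lon", "NY"])   # OutListValueCount(values=["Lon", "NY"])
in_list = len(cities) - out_list                              # InListValueCount
unique_count = len(set(cities))                               # UniqueValueCount
```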
Dataset
Use for exploratory data analysis and data quality checks.
Data definition. You may need to map column types, ID and timestamp.
Dataset stats
Descriptive statistics.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
DataSummaryPreset() | | Optional: | As in individual Metrics. |
DatasetStats() | | None | |
RowCount() | | Optional: | |
ColumnCount() | | Optional: | |
Dataset data quality
Dataset-level data quality metrics.
Data definition. You may need to map column types, ID and timestamp.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
ConstantColumnsCount() | | Optional: | |
EmptyRowsCount() | | Optional: | |
EmptyColumnsCount() | | Optional: | |
DuplicatedRowCount() | | Optional: | |
DuplicatedColumnsCount() | | Optional: | |
DatasetMissingValueCount() | | Required: | |
AlmostConstantColumnsCount() | | Optional: | |
ColumnsWithMissingValuesCount() | | Optional: | |
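A plain-Python sketch of the dataset-level counts above, over a small invented table stored column-wise; again an illustration of the quantities, not the library API. Note that an all-missing column also counts as constant here.

```python
table = {
    "id":   [1, 2, 3, 2],
    "city": ["NY", "NY", "NY", "NY"],     # constant column
    "note": [None, None, None, None],     # empty (and therefore constant) column
    "dup":  [1, 2, 3, 2],                 # duplicate of "id"
}

columns = list(table.values())
rows = list(zip(*columns))

constant_columns = sum(1 for col in columns if len(set(col)) == 1)        # ConstantColumnsCount
empty_columns = sum(1 for col in columns if all(v is None for v in col))  # EmptyColumnsCount
empty_rows = sum(1 for row in rows if all(v is None for v in row))        # EmptyRowsCount
duplicated_rows = len(rows) - len(set(rows))                              # DuplicatedRowCount
duplicated_columns = len(columns) - len(set(map(tuple, columns)))         # DuplicatedColumnsCount
missing_values = sum(v is None for col in columns for v in col)           # DatasetMissingValueCount
```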
Data Drift
Use to detect distribution drift in text, tabular, or embeddings data, or over computed text descriptors. Supports 20+ drift methods, listed separately for text and tabular data.
Data definition. You may need to map column types, ID and timestamp.
Metrics explainers. Understand how data drift works.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
DataDriftPreset() | | Optional: | |
DriftedColumnsCount() | | Optional: | |
ValueDrift() | | Required: | |
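To illustrate what a drift method computes, here is a hand-rolled Population Stability Index (PSI), one common tabular drift measure, in plain Python. This is a sketch of the statistic itself, not the library's implementation; bin count and epsilon are arbitrary choices.

```python
import math

def psi(reference, current, bins=4):
    """Population Stability Index between two numeric samples,
    binned on the reference range. Near 0 means no drift."""
    lo, hi = min(reference), max(reference)

    def shares(values):
        counts = [0] * bins
        for v in values:
            # clamp each value into a reference-range bin
            i = min(max(int((v - lo) / (hi - lo) * bins), 0), bins - 1)
            counts[i] += 1
        # small epsilon so empty bins don't divide by zero
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

Identical samples give a PSI of zero; a shifted current sample gives a large positive value (a common rule of thumb flags drift above 0.2).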
Classification
Use to evaluate quality on a classification task (probabilistic, non-probabilistic, binary and multi-class).
Data definition. You may need to map prediction, target columns and classification type.
General
Use for binary classification and aggregated results for multi-class.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
ClassificationPreset() | | Optional: probas_threshold. | As in individual Metrics. |
ClassificationQuality() | | Optional: probas_threshold | As in individual Metrics. |
Accuracy() | | Optional: | |
Precision() | | Required: | |
Recall() | | Required: | |
F1Score() | | Required: | |
TPR() | | Required: | |
TNR() | | Required: | |
FPR() | | Required: | |
FNR() | | Required: | |
LogLoss() | | Required: | |
RocAUC() | | Required: | |
Dummy metrics:
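A plain-Python sketch of how the binary classification metrics above derive from the confusion matrix; the labels are invented and this is not the library API.

```python
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Confusion matrix cells.
tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))

accuracy  = (tp + tn) / len(actual)                        # Accuracy
precision = tp / (tp + fp)                                 # Precision
recall    = tp / (tp + fn)                                 # Recall, same as TPR
f1        = 2 * precision * recall / (precision + recall)  # F1Score
fpr       = fp / (fp + tn)                                 # FPR
tnr       = tn / (tn + fp)                                 # TNR
fnr       = fn / (fn + tp)                                 # FNR
```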
By label
Use when you have multiple classes and want to evaluate quality for each class separately.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
ClassificationQualityByLabel() | | None | As in individual Metrics. |
PrecisionByLabel() | | Optional: | |
F1ByLabel() | | Optional: | |
RecallByLabel() | | Optional: | |
RocAUCByLabel() | | Optional: | |
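A plain-Python sketch of per-label quality for a multi-class task, mirroring PrecisionByLabel and RecallByLabel; the labels and helper names are invented for illustration.

```python
actual    = ["cat", "dog", "cat", "bird", "dog", "cat"]
predicted = ["cat", "cat", "cat", "bird", "dog", "dog"]

def precision_by_label(label):
    # Among rows predicted as `label`, how many truly are `label`?
    predicted_as = [a for a, p in zip(actual, predicted) if p == label]
    return predicted_as.count(label) / len(predicted_as)

def recall_by_label(label):
    # Among rows truly `label`, how many were predicted as `label`?
    truly = [p for a, p in zip(actual, predicted) if a == label]
    return truly.count(label) / len(truly)
```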
Regression
Use to evaluate the quality of a regression model.
Data definition. You may need to map prediction and target columns.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
RegressionPreset() | | None. | As in individual metrics. |
RegressionQuality() | | None. | As in individual metrics. |
MeanError() | | Required: | |
MAE() | | Required: | |
RMSE() | | Optional: | |
MAPE() | | Required: | |
R2Score() | | Optional: | |
AbsMaxError() | | Optional: | |
Dummy metrics:
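A plain-Python sketch of the regression metrics above, computed from prediction errors; the target and predicted values are invented and this is not the library API.

```python
import math

target    = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 3.0, 6.0]

errors = [p - t for t, p in zip(target, predicted)]

mean_error = sum(errors) / len(errors)                                # MeanError
mae = sum(abs(e) for e in errors) / len(errors)                       # MAE
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))            # RMSE
mape = sum(abs(e / t) for t, e in zip(target, errors)) / len(errors)  # MAPE
abs_max_error = max(abs(e) for e in errors)                           # AbsMaxError

# R2Score: 1 minus residual sum of squares over total sum of squares.
mean_t = sum(target) / len(target)
r2 = 1 - sum(e * e for e in errors) / sum((t - mean_t) ** 2 for t in target)
```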
Ranking
Use to evaluate ranking, search/retrieval, or recommendations.
Data definition. You may need to map prediction and target columns and ranking type.
Metric explainers. Understand how the ranking metrics work.
Metric | Description | Parameters | Test Defaults |
---|---|---|---|
RecallTopK() | | Required: | |
FBetaTopK() | | Required: | |
PrecisionTopK() | | Required: | |
MAP() | | Required: | |
NDCG() | | Required: | |
MRR() | | Required: | |
HitRate() | | Required: | |
ScoreDistribution() | | Required: | |
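A plain-Python sketch of several top-K ranking metrics listed above, with binary relevance; the item IDs and function names are invented, and this illustrates the computations rather than the library API.

```python
import math

ranked = ["a", "b", "c", "d", "e"]   # model output, best first
relevant = {"b", "d", "e", "f"}      # ground-truth relevant items

def precision_at_k(k):               # PrecisionTopK
    return sum(item in relevant for item in ranked[:k]) / k

def recall_at_k(k):                  # RecallTopK
    return sum(item in relevant for item in ranked[:k]) / len(relevant)

def mrr():                           # MRR: reciprocal rank of the first hit
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1 / i
    return 0.0

def ndcg_at_k(k):                    # NDCG with binary relevance
    dcg = sum((item in relevant) / math.log2(i + 1)
              for i, item in enumerate(ranked[:k], start=1))
    ideal = sum(1 / math.log2(i + 1)
                for i in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal
```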