Data for Classification
How to define the data schema for classification.
To evaluate classification model performance, you must correctly map the input data schema.
Column Mapping
To evaluate classification performance, you need both the true labels and the predictions. Depending on the classification type (e.g., binary, multiclass, probabilistic), you have different options for passing the predictions.
Multiclass classification
Option 1
Target: encoded labels, Preds: encoded labels + Optional[target_names].
target | prediction |
---|---|
1 | 1 |
0 | 2 |
… | … |
2 | 2 |
If you pass the target names, they will appear on the visualizations.
You can also pass the target names as a dictionary that maps each encoded label to its name.
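A minimal sketch of this option is shown below. It assumes a legacy Evidently release in which `ColumnMapping` can be imported from the top-level package; the `SimpleNamespace` fallback is only a stand-in that stores attributes so the snippet runs without Evidently. The integer dictionary keys are an assumption about how encoded labels are matched to names.

```python
try:
    # Assumption: a legacy Evidently release that exposes ColumnMapping at the top level.
    from evidently import ColumnMapping
except ImportError:
    # Stand-in so the sketch runs without Evidently; it only stores attributes.
    from types import SimpleNamespace as ColumnMapping

column_mapping = ColumnMapping()
column_mapping.target = "target"          # column with the encoded true labels
column_mapping.prediction = "prediction"  # column with the encoded predicted labels

# Variant 1: target names as a list.
column_mapping.target_names = ["Setosa", "Versicolour", "Virginica"]
names_as_list = column_mapping.target_names

# Variant 2: target names as a dictionary keyed by the encoded label
# (integer keys are an assumption).
column_mapping.target_names = {0: "Setosa", 1: "Versicolour", 2: "Virginica"}
```

The dictionary form makes the label-to-name pairing explicit, which can help when the encoded labels are not a simple 0..N range.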
Option 2
Target: labels, Preds: labels.
target | prediction |
---|---|
‘Versicolour’ | ‘Versicolour’ |
‘Setosa’ | ‘Virginica’ |
… | … |
‘Virginica’ | ‘Virginica’ |
Multiclass probabilistic classification
Target: labels, Preds: columns named after labels.
target | ‘Versicolour’ | ‘Setosa’ | ‘Virginica’ |
---|---|---|---|
‘Setosa’ | 0.98 | 0.01 | 0.01 |
‘Virginica’ | 0.5 | 0.2 | 0.3 |
… | … | … | … |
‘Virginica’ | 0.2 | 0.7 | 0.1 |
The prediction columns must be named after the labels. You cannot pass a custom list of target names instead.
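As a rough illustration, the mapping for this layout could list the probability columns as the prediction. The sketch assumes a legacy Evidently release with a top-level `ColumnMapping` that accepts a list of column names for `prediction`; the fallback class and the two sanity checks are illustrative additions, not part of Evidently.

```python
try:
    # Assumption: a legacy Evidently release that exposes ColumnMapping at the top level.
    from evidently import ColumnMapping
except ImportError:
    # Stand-in so the sketch runs without Evidently; it only stores attributes.
    from types import SimpleNamespace as ColumnMapping

# The visible rows of the table above: true label plus one probability column per label.
rows = [
    {"target": "Setosa", "Versicolour": 0.98, "Setosa": 0.01, "Virginica": 0.01},
    {"target": "Virginica", "Versicolour": 0.5, "Setosa": 0.2, "Virginica": 0.3},
    {"target": "Virginica", "Versicolour": 0.2, "Setosa": 0.7, "Virginica": 0.1},
]
labels = ["Versicolour", "Setosa", "Virginica"]

column_mapping = ColumnMapping()
column_mapping.target = "target"
column_mapping.prediction = labels  # one probability column per class

# Check 1: every label that occurs in the target has a column of the same name.
missing = {row["target"] for row in rows} - set(column_mapping.prediction)

# Check 2: the class probabilities in each row add up to 1 (rounded to absorb float noise).
row_sums = [round(sum(row[label] for label in labels), 6) for row in rows]
```

Running checks like these before building a report can catch a misnamed column early.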
Binary classification
Option 1
Target: encoded labels, Preds: encoded labels + pos_label + Optional[target_names]
target | prediction |
---|---|
1 | 1 |
0 | 1 |
… | … |
1 | 0 |
By default, Evidently expects the positive class to be labeled as ‘1’. If you have a different label, specify it explicitly.
If you pass the target names, they will appear on the visualizations.
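The sketch below shows how a non-default positive label might be declared. It assumes a legacy Evidently release with a top-level `ColumnMapping`; the choice of `0` as the positive class and the label-to-name dictionary are made up for illustration.

```python
try:
    # Assumption: a legacy Evidently release that exposes ColumnMapping at the top level.
    from evidently import ColumnMapping
except ImportError:
    # Stand-in so the sketch runs without Evidently; it only stores attributes.
    from types import SimpleNamespace as ColumnMapping

column_mapping = ColumnMapping()
column_mapping.target = "target"
column_mapping.prediction = "prediction"

# Only needed when the positive class is not labeled 1.
# Here we pretend that 0 encodes the positive class (illustrative choice).
column_mapping.pos_label = 0

# Optional: names shown on the visualizations, keyed by encoded label (assumed form).
column_mapping.target_names = {0: "churn", 1: "not_churn"}
```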
Option 2
Target: labels, Preds: labels + pos_label
target | prediction |
---|---|
‘churn’ | ‘churn’ |
‘not_churn’ | ‘churn’ |
… | … |
‘churn’ | ‘not_churn’ |
Passing the name of the positive class is a requirement in this case.
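To see why the positive class has to be named, consider how precision and recall change with it. The plain-Python sketch below uses only the three visible rows of the table; the `precision_recall` helper is hypothetical and is not Evidently's implementation.

```python
def precision_recall(pairs, pos_label):
    """Compute precision and recall, treating pos_label as the positive class."""
    tp = sum(1 for target, pred in pairs if pred == pos_label and target == pos_label)
    fp = sum(1 for target, pred in pairs if pred == pos_label and target != pos_label)
    fn = sum(1 for target, pred in pairs if pred != pos_label and target == pos_label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# The three visible (target, prediction) rows from the table above.
pairs = [("churn", "churn"), ("not_churn", "churn"), ("churn", "not_churn")]

churn_scores = precision_recall(pairs, pos_label="churn")
not_churn_scores = precision_recall(pairs, pos_label="not_churn")
```

The same predictions give different scores for each choice of positive class, so string labels alone do not tell a tool which class the metrics should describe.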
Binary probabilistic classification
Option 1
Target: labels, Preds: columns named after labels + pos_label
target | ‘churn’ | ‘not_churn’ |
---|---|---|
‘churn’ | 0.9 | 0.1 |
‘churn’ | 0.7 | 0.3 |
… | … | … |
‘not_churn’ | 0.5 | 0.5 |
Passing the name of the positive class is a requirement in this case.
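A possible mapping for this layout is sketched below, assuming a legacy Evidently release with a top-level `ColumnMapping` that accepts a list of column names for `prediction`. The final check is an illustrative addition.

```python
try:
    # Assumption: a legacy Evidently release that exposes ColumnMapping at the top level.
    from evidently import ColumnMapping
except ImportError:
    # Stand-in so the sketch runs without Evidently; it only stores attributes.
    from types import SimpleNamespace as ColumnMapping

column_mapping = ColumnMapping()
column_mapping.target = "target"
column_mapping.prediction = ["churn", "not_churn"]  # one probability column per class
column_mapping.pos_label = "churn"                  # required: name of the positive class

# Illustrative check: the positive label must match one of the probability columns.
pos_label_has_column = column_mapping.pos_label in column_mapping.prediction
```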
Option 2
Target: labels, Preds: a column named like one of the labels + pos_label
target | ‘not_churn’ |
---|---|
‘churn’ | 0.5 |
‘not_churn’ | 0.1 |
… | … |
‘churn’ | 0.9 |
Both naming the column after one of the labels and passing the name of the positive class are requirements.
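Both pieces of information are needed because a single probability column only makes sense once you know which class it refers to. In a binary problem the two class probabilities are complementary, so the positive-class probability can be derived as in the plain-Python sketch below. Whether Evidently performs exactly this conversion internally is not stated here; the helper is hypothetical.

```python
# The visible rows of the table above: the single column holds P(not_churn).
rows = [
    {"target": "churn", "not_churn": 0.5},
    {"target": "not_churn", "not_churn": 0.1},
    {"target": "churn", "not_churn": 0.9},
]

prediction_column = "not_churn"  # the column is named after one of the labels
pos_label = "churn"              # the class the metrics should describe


def positive_probability(row):
    """Return the probability of pos_label for one row."""
    p = row[prediction_column]
    # If the column already describes the positive class, use it as is;
    # otherwise take the complement, since the two probabilities sum to 1.
    return p if prediction_column == pos_label else 1.0 - p


# Rounded to absorb floating-point noise.
churn_probabilities = [round(positive_probability(row), 6) for row in rows]
```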
Option 3
Target: encoded labels, Preds: one column with any name + pos_label
target | prediction |
---|---|
1 | 0.5 |
1 | 0.1 |
… | … |
0 | 0.9 |
If you pass the target names, they will appear on the visualizations.
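For this layout, the mapping could be written with keyword arguments, as in the sketch below. It assumes a legacy Evidently release whose top-level `ColumnMapping` accepts these fields in its constructor; the label-to-name dictionary is made up for illustration.

```python
try:
    # Assumption: a legacy Evidently release that exposes ColumnMapping at the top level.
    from evidently import ColumnMapping
except ImportError:
    # Stand-in so the sketch runs without Evidently; it only stores attributes.
    from types import SimpleNamespace as ColumnMapping

column_mapping = ColumnMapping(
    target="target",          # encoded true labels (0 or 1)
    prediction="prediction",  # single probability column; its name is arbitrary
    pos_label=1,              # encoded label of the positive class
    target_names={0: "not_churn", 1: "churn"},  # optional, keyed by encoded label (assumed form)
)
```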