Data for Classification
How to define the data schema for classification.
To evaluate classification model performance, you must correctly map the input data schema. You need both the true labels and the predictions. Depending on the classification type (e.g., binary, multi-class, probabilistic), you have different options for passing the predictions.
Target: encoded labels, Preds: encoded labels + Optional[target_names].
| target | prediction |
| --- | --- |
| 1 | 1 |
| 0 | 2 |
| … | … |
| 2 | 2 |
If you pass the target names, they will appear on the visualizations.
You can also pass the target names as a dictionary that maps each encoded label to its name.
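As an illustration of what target names do, the sketch below decodes encoded labels with either a list or a dictionary. It is plain Python, not an Evidently call: the variable names, the `decode` helper, and the assumption that a list is read by position (index `i` names encoded label `i`) are all hypothetical and only show the idea.

```python
# Encoded targets and predictions, as in the table above.
target = [1, 0, 2]
prediction = [1, 2, 2]

# Assumed form 1: a list, where position i names encoded label i.
names_as_list = ["Setosa", "Versicolour", "Virginica"]

# Assumed form 2: a dictionary keyed by the encoded label.
names_as_dict = {0: "Setosa", 1: "Versicolour", 2: "Virginica"}


def decode(values, names):
    """Translate encoded labels into readable names (hypothetical helper)."""
    return [names[value] for value in values]


decoded_target = decode(target, names_as_list)
decoded_prediction = decode(prediction, names_as_dict)
print(decoded_target)
print(decoded_prediction)
```

A dictionary makes the mapping explicit, which can help when the encoded labels are not a contiguous range starting at 0.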
Target: labels, Preds: labels.
| target | prediction |
| --- | --- |
| ‘Versicolour’ | ‘Versicolour’ |
| ‘Setosa’ | ‘Virginica’ |
| … | … |
| ‘Virginica’ | ‘Virginica’ |
Target: labels, Preds: columns named after labels.
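| target | Setosa | Versicolour | Virginica |
| --- | --- | --- | --- |
| ‘Setosa’ | 0.98 | 0.01 | 0.01 |
| ‘Virginica’ | 0.5 | 0.2 | 0.3 |
| … | … | … | … |
| ‘Virginica’ | 0.2 | 0.7 | 0.1 |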
Naming the columns after the labels is a requirement. You cannot pass a custom list.
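To make the naming requirement concrete, here is a minimal sketch in plain Python. It checks that every target label has a probability column of the same name, then picks the most probable label per row. The column order, the helper names, and the argmax rule are assumptions for illustration; this page does not state how Evidently derives a label from the probabilities.

```python
# Rows follow the layout above: a target column with the true label, plus
# one probability column per label. The column order is an assumption.
rows = [
    {"target": "Setosa", "Setosa": 0.98, "Versicolour": 0.01, "Virginica": 0.01},
    {"target": "Virginica", "Setosa": 0.5, "Versicolour": 0.2, "Virginica": 0.3},
    {"target": "Virginica", "Setosa": 0.2, "Versicolour": 0.7, "Virginica": 0.1},
]
prediction_columns = ["Setosa", "Versicolour", "Virginica"]


def missing_columns(rows, prediction_columns):
    """Return target labels that have no probability column of the same name."""
    labels = {row["target"] for row in rows}
    return sorted(labels - set(prediction_columns))


def predicted_label(row, prediction_columns):
    """Pick the label whose column holds the highest probability (argmax)."""
    return max(prediction_columns, key=lambda name: row[name])


missing = missing_columns(rows, prediction_columns)
predicted = [predicted_label(row, prediction_columns) for row in rows]
print(missing)
print(predicted)
```

With a custom column name such as `p_virginica` (hypothetical), the check would report `Virginica` as missing, which is the situation the requirement rules out.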
Target: encoded labels, Preds: encoded labels + pos_label + Optional[target_names].

| target | prediction |
| --- | --- |
| 1 | 1 |
| 0 | 1 |
| … | … |
| 1 | 0 |
By default, Evidently expects the positive class to be labeled as ‘1’. If you have a different label, specify it explicitly.
If you pass the target names, they will appear on the visualizations.
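The choice of positive class changes how every binary metric is counted. The sketch below is plain Python with a hypothetical `confusion_counts` helper, not Evidently code. It counts true and false positives and negatives for the same three rows, first with the default positive label `1` and then with `0`.

```python
def confusion_counts(target, prediction, pos_label=1):
    """Count TP, FP, FN, TN, treating pos_label as the positive class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(target, prediction):
        if predicted == pos_label:
            counts["tp" if actual == pos_label else "fp"] += 1
        else:
            counts["fn" if actual == pos_label else "tn"] += 1
    return counts


# The three visible rows of the table above.
target = [1, 0, 1]
prediction = [1, 1, 0]

default_counts = confusion_counts(target, prediction)
flipped_counts = confusion_counts(target, prediction, pos_label=0)
print(default_counts)
print(flipped_counts)
```

The same data yields one true positive under the default and none when `0` is treated as positive, so precision and recall differ even though the predictions are unchanged.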
Target: labels, Preds: labels + pos_label.

| target | prediction |
| --- | --- |
| ‘churn’ | ‘churn’ |
| ‘not_churn’ | ‘churn’ |
| … | … |
| ‘churn’ | ‘not_churn’ |
Passing the name of the positive class is a requirement in this case.
Target: labels, Preds: columns named after labels + pos_label.

| target | churn | not_churn |
| --- | --- | --- |
| ‘churn’ | 0.9 | 0.1 |
| ‘churn’ | 0.7 | 0.3 |
| … | … | … |
| ‘not_churn’ | 0.5 | 0.5 |
Passing the name of the positive class is a requirement in this case.
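One way to see why the positive class must be named: with two probability columns, a label can only be derived once you know which column to threshold. The sketch below is plain Python. The column order, the `to_labels` helper, and the 0.5 cutoff are assumptions for illustration, not documented Evidently behavior.

```python
# Rows follow the layout above; which column holds which value is assumed.
rows = [
    {"target": "churn", "churn": 0.9, "not_churn": 0.1},
    {"target": "churn", "churn": 0.7, "not_churn": 0.3},
    {"target": "not_churn", "churn": 0.5, "not_churn": 0.5},
]
THRESHOLD = 0.5  # hypothetical cutoff


def to_labels(rows, pos_label, neg_label, threshold=THRESHOLD):
    """Assign pos_label when its probability reaches the threshold."""
    return [
        pos_label if row[pos_label] >= threshold else neg_label
        for row in rows
    ]


churn_positive = to_labels(rows, pos_label="churn", neg_label="not_churn")
not_churn_positive = to_labels(rows, pos_label="not_churn", neg_label="churn")
print(churn_positive)
print(not_churn_positive)
```

The last row, where both probabilities are 0.5, receives a different label depending on which class is treated as positive.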
Target: labels, Preds: a column named like one of the labels + pos_label.

| target | churn |
| --- | --- |
| ‘churn’ | 0.5 |
| ‘not_churn’ | 0.1 |
| … | … |
| ‘churn’ | 0.9 |
Both naming the column after one of the labels and passing the name of the positive class are requirements.
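A single probability column is only interpretable if its name says which label it refers to. The sketch below is plain Python with a hypothetical `add_complement` helper. It assumes the column holds the probability of the label it is named after and that the two class probabilities sum to one. Whether Evidently performs this step internally is not stated on this page.

```python
# Rows follow the layout above: the true label plus one column named 'churn'.
rows = [
    {"target": "churn", "churn": 0.5},
    {"target": "not_churn", "churn": 0.1},
    {"target": "churn", "churn": 0.9},
]


def add_complement(rows, column, other_label):
    """Add the other label's probability as 1 - p (binary case only)."""
    return [
        {**row, other_label: round(1.0 - row[column], 10)}
        for row in rows
    ]


expanded = add_complement(rows, column="churn", other_label="not_churn")
for row in expanded:
    print(row)
```

If the column had an arbitrary name, there would be no way to tell from the data whether 0.9 means a likely `churn` or a likely `not_churn`.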
Target: encoded labels, Preds: one column with any name + pos_label.

| target | prediction |
| --- | --- |
| 1 | 0.5 |
| 1 | 0.1 |
| … | … |
| 0 | 0.9 |
If you pass the target names, they will appear on the visualizations.