CLI
How to use Evidently from the Terminal.
This section applies both to Dashboards and Profiles.
Command line interface
If you prefer a video version, here is a 7-minute Quick Start on how to use Evidently from the CLI.
To start, prepare your data as csv files.
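As a minimal sketch of this preparation step, the snippet below splits one set of rows into a reference file and a current file using only the Python standard library. The function name `split_to_csv`, the file names `reference.csv` and `current.csv`, the column names, and the split point are all hypothetical and chosen only for illustration.

```python
import csv
import tempfile
from pathlib import Path


def split_to_csv(rows, header, split_index, out_dir):
    """Write rows[:split_index] to reference.csv and the rest to current.csv.

    Both files are comma-separated and start with the same header row.
    Returns a dict mapping "reference" / "current" to the written paths.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    parts = (("reference", rows[:split_index]), ("current", rows[split_index:]))
    for name, part in parts:
        path = out / f"{name}.csv"
        # newline="" lets the csv module control line endings itself.
        with path.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(part)
        paths[name] = path
    return paths


# Illustrative data only: two feature-like columns and four rows.
header = ["age", "target"]
rows = [[34, 1], [51, 0], [27, 1], [45, 0]]

with tempfile.TemporaryDirectory() as tmp:
    written = split_to_csv(rows, header, split_index=2, out_dir=tmp)
    reference_lines = written["reference"].read_text().splitlines()
    current_lines = written["current"].read_text().splitlines()
```

In practice you would write the two files to a permanent folder and pass their paths to the command line.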
To generate the HTML report, run the following command in bash:
To generate a JSON profile, run the following command in bash:
Here:
reference is the path to the reference data,
current is the path to the current data,
output is the path to the output folder,
config is the path to the configuration file,
pretty_print prints the JSON profile with indents (for profile only).
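The parameters above can be assembled into a command programmatically. The sketch below only builds and prints the argument list; it does not run anything. The entry point `python -m evidently calculate` and the `--name` spelling of each flag are assumptions based on older Evidently releases and are not confirmed by this page, so check them against the help output of your installed version. The helper `build_command` and all file names are hypothetical.

```python
import shlex
from typing import List


def build_command(kind: str, reference: str, current: str, output: str,
                  config: str, pretty_print: bool = False) -> List[str]:
    """Assemble an argument list for a dashboard or profile run.

    ASSUMPTION: the "python -m evidently calculate <kind>" entry point and
    the "--<parameter>" flag spelling are not taken from this page.
    """
    if kind not in ("dashboard", "profile"):
        raise ValueError("kind must be 'dashboard' or 'profile'")
    cmd = [
        "python", "-m", "evidently", "calculate", kind,
        "--config", config,
        "--reference", reference,
        "--current", current,
        "--output", output,
    ]
    if pretty_print:
        # pretty_print applies to JSON profiles only.
        if kind != "profile":
            raise ValueError("pretty_print is only valid for profiles")
        cmd.append("--pretty_print")
    return cmd


cmd = build_command("profile", "reference.csv", "current.csv",
                    "output_folder", "config.json", pretty_print=True)
# Print a shell-quoted version that could be pasted into a terminal.
print(shlex.join(cmd))
```

Building the list first makes it easy to review the exact command before passing it to `subprocess.run`.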
You can choose the following Tabs:
data_drift to estimate the data drift,
num_target_drift to estimate target drift for the numerical target,
cat_target_drift to estimate target drift for the categorical target,
regression_performance to explore the performance of a regression model,
classification_performance to explore the performance of a classification model,
prob_classification_performance to explore the performance of a probabilistic classification model.
To configure the report, create a config.json or a config.yaml file. This file specifies how your input data is read and which type of report is generated.
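A configuration file of this kind can also be generated from Python. The sketch below writes a simple configuration for comma-separated files with headers and no date column. The key names `data_format`, `separator`, `header`, `date_column`, and `dashboard_tabs` are assumptions recalled from older Evidently releases and are not confirmed by this page; only `column_mapping` and the tab name `data_drift` are mentioned here. Verify the schema against your installed version before relying on it.

```python
import json

# ASSUMPTION: key names below follow older Evidently releases and may differ
# in your version. Only "column_mapping" and the tab name come from this page.
config = {
    "data_format": {
        "separator": ",",     # comma-separated csv files
        "header": True,       # the first row holds column names
        "date_column": None,  # no date column in the data
    },
    "column_mapping": {},
    "dashboard_tabs": ["data_drift"],
}

# Serialize with indentation so the file is easy to read and edit by hand.
config_text = json.dumps(config, indent=2)
print(config_text)

# To save it next to your data, uncomment the following lines:
# with open("config.json", "w") as handle:
#     handle.write(config_text)
```

Keeping the configuration as a dictionary first makes it straightforward to add a tab or adjust the data format before writing the file.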
Configuration examples
Here is an example of a simple configuration, where we have comma-separated csv files with headers and there is no date column in the data.
Dashboard:
Profile:
Here is an example of a more complicated configuration, where we have comma-separated csv files with headers and a datetime column. We also specify the column_mapping dictionary, where we add information about the datetime, target and numerical_features.
Dashboard:
Profile:
Telemetry
Telemetry is collected in Evidently starting from version 0.1.21.dev0.
When you use Evidently in the command-line interface, we collect some basic telemetry. It includes data on the environment (e.g. Python version) and usage (type of report or profile generated). You can read more about what we collect here.
You can opt out of telemetry collection by setting the environment variable:
Sampling for large datasets
As shown in the configuration example above, you can specify sampling parameters for large files. You can use different sampling strategies for the reference and current data, or apply sampling only to one of the files.
Currently, you can choose from the following options:
none - no sampling will be applied
nth - each Nth row of the file will be taken. This option works together with the n parameter (see the example with the Dashboard above)
random - random sampling will be applied. This option works together with the ratio parameter (see the example with the Profile above)
If you do not specify the sampling parameters in the configuration, it will be treated as none and no sampling will be applied.
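To make the three options concrete, here is a minimal sketch of what each strategy does to a list of rows. It is an illustration, not Evidently's implementation: the function `sample_rows` and its `seed` argument are hypothetical, the nth variant here keeps rows 0, n, 2n and so on (the actual offset may differ), and the random variant keeps each row independently with probability `ratio`, so the resulting size is only approximately `ratio` times the input size.

```python
import random
from typing import List, Optional, Sequence


def sample_rows(rows: Sequence, strategy: str = "none",
                n: Optional[int] = None, ratio: Optional[float] = None,
                seed: Optional[int] = None) -> List:
    """Illustrative sampling; mirrors the option names none, nth and random."""
    if strategy == "none":
        # No sampling: keep every row.
        return list(rows)
    if strategy == "nth":
        if n is None or n < 1:
            raise ValueError("nth sampling needs an integer n >= 1")
        # Keep every n-th row, starting from the first one.
        return list(rows[::n])
    if strategy == "random":
        if ratio is None or not 0.0 <= ratio <= 1.0:
            raise ValueError("random sampling needs a ratio in [0, 1]")
        rng = random.Random(seed)
        # Keep each row independently with probability `ratio`.
        return [row for row in rows if rng.random() < ratio]
    raise ValueError(f"unknown sampling strategy: {strategy!r}")


rows = list(range(10))
every_third = sample_rows(rows, "nth", n=3)
roughly_half = sample_rows(rows, "random", ratio=0.5, seed=0)
untouched = sample_rows(rows)  # defaults to "none"
```

Because the strategy is chosen per call, the reference and current data can be sampled differently, or only one of them can be sampled.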