Data drift parameters
How to set custom data drift conditions and thresholds for tabular and text data.
Prerequisites:
You know how to generate Reports or Test Suites with default parameters.
You know how to pass custom parameters for Reports or Test Suites.
You know how to use Column Mapping to set the input data type.
Default
All Presets, Tests, and Metrics that include data or target (prediction) drift evaluation use the default Data Drift algorithm. It automatically selects an appropriate drift detection method based on the feature type and volume.
You can override the defaults by passing a custom parameter to the chosen Test, Metric, or Preset. You can define the drift detection method, the threshold, or both.
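To make "based on the feature type and volume" concrete, here is a minimal sketch of the selection logic in plain Python. The exact rules below (the 1000-row cutoff, the 5-unique-values cutoff, the method names, and the default thresholds) are assumptions based on the commonly documented Evidently defaults, not something stated on this page. Check them against the version you run.

```python
def default_drift_method(column_type: str, n_rows: int, n_unique: int) -> tuple:
    """Return an assumed (method name, default threshold) pair for one column.

    column_type: "numerical" or "categorical"
    n_rows: number of rows in the reference dataset
    n_unique: number of unique values in the column
    """
    small = n_rows <= 1000  # assumed cutoff between "small" and "large" data

    # Numerical columns with many distinct values
    if column_type == "numerical" and n_unique > 5:
        return ("ks", 0.05) if small else ("wasserstein", 0.1)

    # Binary columns
    if n_unique <= 2:
        return ("z", 0.05) if small else ("jensenshannon", 0.1)

    # Categorical columns, or numerical columns with few distinct values
    return ("chisquare", 0.05) if small else ("jensenshannon", 0.1)


print(default_drift_method("numerical", 500, 100))
print(default_drift_method("categorical", 5000, 10))
```

Any of these choices can be replaced with the custom parameters described in the rest of this page.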
Code example
An example how-to notebook shows how to pass custom drift parameters.
Examples
To set a custom drift method and threshold on the column level:
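A minimal sketch of a column-level override, assuming the legacy Evidently API (`evidently.report.Report` and `evidently.metrics.ColumnDriftMetric`). The column name `age`, the `psi` method, and the `0.2` threshold are illustrative values only.

```python
# Illustrative values: "age" is a hypothetical column name.
column_drift_params = {
    "column_name": "age",
    "stattest": "psi",           # drift detection method
    "stattest_threshold": 0.2,   # custom threshold for that method
}


def build_column_drift_report():
    """Create a Report with one custom column drift check."""
    # Imported here so the parameters above can be inspected
    # even where Evidently is not installed.
    from evidently.metrics import ColumnDriftMetric
    from evidently.report import Report

    return Report(metrics=[ColumnDriftMetric(**column_drift_params)])
```

You would then call `run(reference_data=..., current_data=...)` on the returned Report with your own reference and current DataFrames.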
If you have a Preset, Test or Metric that checks for drift in multiple columns at the same time, you can set a custom drift method for all columns, all numerical/categorical columns, or for each column individually.
Here is how you set the drift detection method for all categorical columns:
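A minimal sketch, assuming the legacy Evidently API, where `DataDriftPreset` accepts `cat_stattest` and `cat_stattest_threshold`. The method and threshold values are illustrative.

```python
# Applies to every categorical column covered by the preset.
categorical_drift_params = {
    "cat_stattest": "psi",
    "cat_stattest_threshold": 0.2,
}


def build_categorical_drift_report():
    """Create a Report that uses one drift method for all categorical columns."""
    # Imported here so the parameters above can be inspected
    # even where Evidently is not installed.
    from evidently.metric_preset import DataDriftPreset
    from evidently.report import Report

    return Report(metrics=[DataDriftPreset(**categorical_drift_params)])
```

Under the same assumption, `num_stattest` covers all numerical columns, `stattest` covers all columns, and `per_column_stattest` takes a dictionary that maps column names to methods.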
To set a custom condition for the dataset drift (share of drifting columns in the dataset) in the relevant Metrics or Presets:
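A minimal sketch, assuming the legacy Evidently API, where `DataDriftPreset` accepts a `drift_share` parameter. The helper `dataset_drift_detected` is hypothetical and only illustrates the assumed rule: dataset drift is reported when the share of drifted columns reaches the configured value.

```python
# Report dataset drift only when at least 70% of columns drift (illustrative value).
dataset_drift_params = {"drift_share": 0.7}


def dataset_drift_detected(column_drifted, drift_share=0.5):
    """Return True when the share of drifted columns reaches drift_share.

    column_drifted: list of booleans, one per column.
    The >= comparison and the 0.5 default are assumptions.
    """
    share = sum(column_drifted) / len(column_drifted)
    return share >= drift_share


def build_dataset_drift_report():
    """Create a Report with a custom dataset drift condition."""
    # Imported here so the helper above runs even where Evidently is not installed.
    from evidently.metric_preset import DataDriftPreset
    from evidently.report import Report

    return Report(metrics=[DataDriftPreset(**dataset_drift_params)])


# Two of four columns drifted: a share of 0.5 is below the 0.7 condition.
print(dataset_drift_detected([True, True, False, False], drift_share=0.7))
```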
Note that this works slightly differently for Tests. To set a custom condition for the dataset drift when you run a relevant Test, you should set a condition for the share of drifted features using standard lt and gt parameters:
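A minimal sketch of such a condition, assuming the legacy Evidently API, where the share-based test is exposed as `TestShareOfDriftedColumns` in `evidently.tests`. The `0.5` value is illustrative.

```python
# The Test passes while fewer than 50% of columns drift.
share_test_params = {"lt": 0.5}


def build_drift_share_test_suite():
    """Create a Test Suite with a custom condition on the share of drifted columns."""
    # Imported here so the parameters above can be inspected
    # even where Evidently is not installed.
    from evidently.test_suite import TestSuite
    from evidently.tests import TestShareOfDriftedColumns

    return TestSuite(tests=[TestShareOfDriftedColumns(**share_test_params)])
```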
When you set the drift threshold for ColumnDriftTest(), use stattest_threshold and the other parameters the same way as in Metrics (not lt and gt).
Tabular drift detection
The following methods and parameters apply to tabular data (as parsed automatically or specified as numerical or categorical columns in the column mapping).
Drift parameters - Tabular
The following drift detection parameters are available in DataDriftTable(), DatasetDriftMetric(), ColumnDriftMetric(), related Tests, and Presets that contain them.
How to check available parameters. You can verify which parameters are available for a specific Test, Metric, or Preset in the All tests or All metrics tables, or consult the API reference.
Drift detection methods - Tabular
To use the following drift detection methods, pass them using the stattest parameter.
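When you pair a method with a custom threshold, keep in mind that the threshold means different things for different methods. The sketch below illustrates the usual convention in plain Python. The grouping of method names and the comparison directions are assumptions, not taken from this page, so confirm them for the method you choose.

```python
# Assumed grouping of method names by the kind of score they return.
P_VALUE_METHODS = {"ks", "chisquare", "z"}
DISTANCE_METHODS = {"wasserstein", "jensenshannon", "psi", "kl_div"}


def is_drift(method: str, score: float, threshold: float) -> bool:
    """Decide drift from a method's score and a threshold.

    For p-value methods, a small p-value signals drift.
    For distance methods, a large distance signals drift.
    """
    if method in P_VALUE_METHODS:
        return score < threshold
    if method in DISTANCE_METHODS:
        return score >= threshold
    raise ValueError(f"Unknown method: {method}")


print(is_drift("ks", 0.01, 0.05))   # low p-value
print(is_drift("psi", 0.15, 0.2))   # distance under the threshold
```

This is why a threshold such as 0.05 suits a statistical test, while a value such as 0.1 or 0.2 is more typical for a distance metric.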
Text drift detection
Text drift detection applies to columns with raw text data, as specified in column mapping.
Embedding drift detection. If you work with embeddings, you can use Embeddings Drift Detection methods.
Drift parameters - Text
The following text drift detection parameters are available in DataDriftTable(), DatasetDriftMetric(), ColumnDriftMetric(), related Tests, and Presets that contain them.
Drift detection methods - Text
To use the following text drift detection methods, pass them using the stattest parameter.
Text descriptors drift
You can also check for distribution drift in text descriptors, such as text length.
To use this method, call a separate TextDescriptorsDriftMetric(). You can pass any of the tabular drift detection methods as a parameter.
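A minimal sketch, assuming the legacy Evidently API, where `TextDescriptorsDriftMetric` takes `column_name`, `stattest`, and `stattest_threshold`. The column name `review_text` and the parameter values are hypothetical.

```python
# "review_text" is a hypothetical raw text column.
text_descriptor_params = {
    "column_name": "review_text",
    "stattest": "psi",           # any tabular drift detection method
    "stattest_threshold": 0.2,
}


def build_text_descriptors_report():
    """Create a Report that checks drift in text descriptors of one column."""
    # Imported here so the parameters above can be inspected
    # even where Evidently is not installed.
    from evidently.metrics import TextDescriptorsDriftMetric
    from evidently.report import Report

    return Report(metrics=[TextDescriptorsDriftMetric(**text_descriptor_params)])
```

The text column must also be declared as a text feature in the column mapping you pass when running the Report.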