Evidently and Metaflow
Run model evaluation or data drift analysis as a Metaflow Flow, save the Evidently metrics in S3, and visualize them with the optional Metaflow UI.
Metaflow is an open-source framework that helps scientists and engineers build and manage real-life data science projects.
You can use this integration to generate Evidently HTML reports and test suites inside a Metaflow Flow and visualize them as a Card, using the metaflow-card-html plugin.
Many machine learning teams use Metaflow to orchestrate multiple stages of the ML lifecycle, such as data preparation, training, deployment, and serving predictions, and also use it as a model registry.
If you are already familiar with Metaflow, here is an example of how to integrate it with Evidently to track data quality and data drift.
In this case, Metaflow orchestrates the execution of the Flow. Evidently calculates the metrics and tests and generates the visual report, and Metaflow logs the HTML results as an artifact. You can then access the results in the Metaflow UI or retrieve them via the cards API.
With Metaflow, you can organize your batch process into multiple Flows, such as:
- 1. TrainingFlow: retrieves data, splits it into train/test, trains multiple models in parallel, identifies the best one, and stores it as an artifact
- 2. ServingFlow: retrieves the best model from the latest successful TrainingFlow and uses it to make predictions on the new data
- 3. MonitoringFlow: triggered by the ServingFlow, retrieves the data used in the last successful run of each Flow and calculates the desired metrics, such as data quality and data drift, where reference is the data used in the TrainingFlow and current comes from the ServingFlow
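As a rough sketch of how a MonitoringFlow step might pull data from the other Flows, the helper below reads one artifact from the latest successful run of a Flow through the Metaflow Client API (`Flow(...).latest_successful_run` and `run.data`). The helper name `latest_artifact`, the `flow_lookup` parameter, and the artifact names are hypothetical and only serve the illustration.

```python
from typing import Any, Callable, Optional


def latest_artifact(
    flow_name: str,
    artifact_name: str,
    flow_lookup: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Return one artifact from the latest successful run of a Flow.

    `flow_lookup` is a hypothetical hook that defaults to `metaflow.Flow`;
    it exists only so the helper can be tried out without Metaflow.
    """
    if flow_lookup is None:
        # Imported lazily: this path needs Metaflow and existing runs.
        from metaflow import Flow

        flow_lookup = Flow

    run = flow_lookup(flow_name).latest_successful_run
    if run is None:
        raise RuntimeError(f"No successful run found for {flow_name}")
    # `run.data` exposes the artifacts available at the end of that run.
    return getattr(run.data, artifact_name)
```

Inside a monitoring step this could be used as `self.reference = latest_artifact("TrainingFlow", "train_data")` and `self.current = latest_artifact("ServingFlow", "new_data")`, assuming those Flows store artifacts under these (made-up) names.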
Note: Evidently calculates a rich set of metrics and statistical tests. You can choose any of the pre-built reports and test suites to define the type of analysis you’d want to get.
Within every Flow, it is possible to store artifacts that can be visualised with the card feature. This way, you can save the HTML content of the Evidently reports to be visualized with the metaflow-card-html plugin.
In this example, we will use Evidently to check input features for Data Drift and log and visualize the resulting report with Metaflow.
Evidently is available as a PyPI package:
$ pip install evidently
To install Metaflow, run:
$ pip install metaflow
Install the metaflow-card-html plugin:
$ pip install metaflow-card-html
Also install any other dependencies you need, such as scikit-learn, which the example below uses to load the data.
We will use the following helper function to simplify obtaining the final fully rendered HTML content for the Evidently reports.
def get_evidently_html(evidently_object) -> str:
    """Returns the rendered EvidentlyAI report/metric as HTML

    Should be assigned to `self.html`, installing `metaflow-card-html` to be rendered
    """
    import tempfile

    with tempfile.NamedTemporaryFile() as tmp:
        evidently_object.save_html(tmp.name)
        with open(tmp.name) as fh:
            return fh.read()
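On some platforms (notably Windows), a file created by `NamedTemporaryFile` cannot be reopened by name while it is still open. If that matters for your setup, one possible variant writes into a temporary directory instead. This is a sketch, not part of the original helper: the name `get_evidently_html_portable` and the `FakeReport` stand-in are hypothetical, and reading the file as UTF-8 is an assumption about how the report is written.

```python
import os
import tempfile


def get_evidently_html_portable(evidently_object) -> str:
    """Render any object exposing `save_html(path)` and return the HTML text."""
    # A temporary directory avoids reopening a file that is still held open.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "report.html")
        evidently_object.save_html(path)
        # Assumption: the HTML file is UTF-8 encoded.
        with open(path, encoding="utf-8") as fh:
            return fh.read()


class FakeReport:
    """Hypothetical stand-in for an Evidently report, used only for a dry run."""

    def save_html(self, path: str) -> None:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("<html><body>ok</body></html>")


html = get_evidently_html_portable(FakeReport())
print(html)
```

The stand-in only demonstrates the contract the helper relies on: the object passed in must provide a `save_html(path)` method.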
The start step is based on the Evidently getting started tutorial, preparing the data to be used in the following steps.
The monitoring_data_quality method behaves as a step, due to the @mf.step decorator. The @mf.card(type='html') decorator adds behavior, ensuring the attribute self.html will be stored and properly rendered as HTML in the Card.
import metaflow as mf
class GettingStartedEvidentlyFlow(mf.FlowSpec):
    @mf.step
    def start(self):
        import numpy as np
        from sklearn.datasets import fetch_california_housing

        data = fetch_california_housing(as_frame=True)
        housing_data = data.frame
        housing_data.rename(columns={"MedHouseVal": "target"}, inplace=True)
        # Simulate predictions by adding Gaussian noise to the target.
        housing_data["prediction"] = housing_data["target"].values + np.random.normal(
            0, 5, housing_data.shape[0]
        )
        # Two random samples stand in for the reference and current datasets.
        self.reference = housing_data.sample(n=5000, replace=False)
        self.current = housing_data.sample(n=5000, replace=False)
        self.next(self.monitoring_data_quality)

    @mf.card(type="html")
    @mf.step
    def monitoring_data_quality(self):
        import os

        os.system("pip install evidently --quiet")
        from evidently.test_preset import DataStabilityTestPreset
        from evidently.test_suite import TestSuite
        from tools.helper_functions import get_evidently_html

        print("Monitoring: data quality tests")
        data_stability = TestSuite(
            tests=[
                DataStabilityTestPreset(),
            ]
        )
        data_stability.run(reference_data=self.reference, current_data=self.current)
        # The html card type renders whatever is stored in self.html.
        self.html = get_evidently_html(data_stability)
        self.next(self.end)

    @mf.step
    def end(self):
        print("Flow completed")


if __name__ == "__main__":
    GettingStartedEvidentlyFlow()
The Flow can be executed with the command:
$ python <path_to_the_metaflow_script>.py run
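Besides the HTML card, a step could also keep a small machine-readable summary as a regular Metaflow artifact, which is persisted in the configured datastore together with the other artifacts. The sketch below condenses a test-suite result dictionary into such a summary. It assumes the dictionary has a `"tests"` list whose entries carry `"name"` and `"status"` keys (the shape the test suite's `as_dict()` output is assumed to have here); the function name and the sample data are hypothetical.

```python
from collections import Counter
from typing import Any, Dict


def summarize_test_suite(result: Dict[str, Any]) -> Dict[str, Any]:
    """Condense a test-suite result dictionary into a small summary."""
    # Assumption: each entry in "tests" has "name" and "status" keys.
    tests = result.get("tests", [])
    by_status = Counter(test.get("status", "UNKNOWN") for test in tests)
    failed = [
        test.get("name", "<unnamed>")
        for test in tests
        if test.get("status") == "FAIL"
    ]
    return {"total": len(tests), "by_status": dict(by_status), "failed": failed}


# Hypothetical sample, shaped like the dictionary the suite is assumed to return.
sample = {
    "tests": [
        {"name": "example test A", "status": "SUCCESS"},
        {"name": "example test B", "status": "FAIL"},
    ]
}
summary = summarize_test_suite(sample)
print(summary)
```

In the monitoring step, this might be used as `self.summary = summarize_test_suite(data_stability.as_dict())`, so that later Flows or notebooks can read the summary without parsing the HTML.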
The resulting card can be viewed in multiple ways, such as via the optional Metaflow UI, the API client, or simply the command line interface:
$ python <path_to_the_metaflow_script>.py card view <step_name>
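For programmatic access, the card can also be fetched with `get_cards` from `metaflow.cards`. The following is a minimal sketch under a few assumptions: the helper names are hypothetical, the latest run of the Flow is the one of interest, and the container returned by `get_cards` is used only through `len()`, indexing, and the card's `get()` method.

```python
from typing import Any, Sequence


def first_card_html(cards: Sequence[Any]) -> str:
    """Return the HTML of the first card in a container of card objects."""
    if len(cards) == 0:
        raise LookupError("No cards found for this task")
    # Each card object is expected to expose `get()` returning its HTML.
    return cards[0].get()


def fetch_card_html(flow_name: str, step_name: str) -> str:
    """Look up the html card of a step in the latest run of a Flow."""
    # Imported lazily: these calls need Metaflow and an existing run.
    from metaflow import Flow
    from metaflow.cards import get_cards

    run = Flow(flow_name).latest_run
    task = run[step_name].task
    return first_card_html(get_cards(task, type="html"))
```

For the Flow above, `fetch_card_html("GettingStartedEvidentlyFlow", "monitoring_data_quality")` would return the stored report as a string, which could then be written to a file or embedded in a notebook.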
Here is an example of the Card in the Metaflow UI:
