To run a check not available in Evidently, you can implement it as a custom function. Use this for building your own programmatic evaluators.
You can also customize existing evals with parameters, such as defining custom LLM judges or using regex-based metrics like Contains for word lists. See available descriptors.
You can define a CustomColumnDescriptor that will:
take any column from your dataset to evaluate each value inside it
return a single column with numerical (num) scores or categorical (cat) labels.
Implement it as a Python function that takes a DatasetColumn (which wraps a Pandas Series in its data attribute) as input and returns a transformed DatasetColumn. For example, to label each value in a column as empty or non-empty:
def is_empty(data: DatasetColumn) -> DatasetColumn:
    return DatasetColumn(
        type="cat",
        data=pd.Series([
            "EMPTY" if val == "" else "NON EMPTY"
            for val in data.data]))
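Before wrapping a rule in a DatasetColumn, it can help to check the per-value logic on its own. The following is a minimal sketch in plain Python that needs neither Evidently nor pandas; `label_empty` is a hypothetical helper name used only for illustration and is not part of Evidently.

```python
from typing import Iterable, List, Optional


def label_empty(values: Iterable[Optional[str]]) -> List[str]:
    """Apply the same per-value rule as the is_empty example."""
    # Only the exact empty string counts as empty.
    # Whitespace-only strings and None are labeled "NON EMPTY".
    return ["EMPTY" if val == "" else "NON EMPTY" for val in values]


labels = label_empty(["hello", "", " ", None])
print(labels)
```

Under this rule a whitespace-only or missing value is labeled NON EMPTY, so consider normalizing values first if those cases should count as empty.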
You can alternatively define a CustomDescriptor that:
Takes one or many named columns from your dataset,
Returns one or many transformed columns.
Pairwise evaluation. For example, to check exact match between target_answer and answer columns, and return a label:
def exact_match(dataset: Dataset) -> DatasetColumn:
    return DatasetColumn(
        type="cat",
        data=pd.Series([
            "MATCH" if val else "MISMATCH"
            for val in dataset.column("target_answer").data == dataset.column("answer").data]))
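The pairwise rule itself can be sanity-checked on plain lists before it is wired into a descriptor. This is a sketch only, assuming both columns have the same length; `label_matches` is a hypothetical helper name, not an Evidently function.

```python
from typing import List, Sequence


def label_matches(targets: Sequence[str], answers: Sequence[str]) -> List[str]:
    """Compare two columns position by position and label each pair."""
    # Refuse columns of different length instead of silently truncating.
    if len(targets) != len(answers):
        raise ValueError("columns must have the same length")
    return [
        "MATCH" if target == answer else "MISMATCH"
        for target, answer in zip(targets, answers)
    ]


match_labels = label_matches(["Paris", "4", "yes"], ["Paris", "four", "Yes"])
print(match_labels)
```

Exact match is case-sensitive and whitespace-sensitive, so "yes" and "Yes" count as a mismatch here.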
Multiple scores. You can also use CustomDescriptor to run evals for multiple columns and return multiple scores. As a fun example, let’s reverse the text in the question and answer columns:
from typing import Union, Dict

def reverse_text(dataset: Dataset) -> Union[DatasetColumn, Dict[str, DatasetColumn]]:
    return {
        "reversed_question": DatasetColumn(
            type="cat",
            data=pd.Series([
                value[::-1]
                for value in dataset.column("question").data])),
        "reversed_answer": DatasetColumn(
            type="cat",
            data=pd.Series([
                value[::-1]
                for value in dataset.column("answer").data]))}
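The shape of a multi-output result, a dictionary that maps each new column name to its values, can be illustrated without Evidently. The sketch below uses plain dictionaries and lists as stand-ins for the dataset and for DatasetColumn; `reverse_columns` is a hypothetical name chosen for this example.

```python
from typing import Dict, List


def reverse_columns(columns: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return one reversed output column per input column."""
    # value[::-1] reverses the characters of the whole string,
    # not the order of the words in it.
    return {
        f"reversed_{name}": [value[::-1] for value in values]
        for name, values in columns.items()
    }


reversed_cols = reverse_columns({
    "question": ["abc"],
    "answer": ["hi there"],
})
print(reversed_cols)
```

Each key in the returned dictionary plays the role of an output column name, mirroring the "reversed_question" and "reversed_answer" keys in the example above.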