How to run regression testing for LLM outputs.
CloudWorkspace with Workspace.
New toy data generation
tracely library to instrument your app and get traces as a tabular dataset. Check the tutorial with tracing workflow.
target_response column and provide reasoning for its decision.
CorrectnessLLMEval() to use a default prompt.
TextLength() descriptor.
TextEvals to summarize all descriptors.
gt (greater than), lt (less than), eq (equal), etc. (check the Test docs).
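To make the condition names concrete, here is a minimal plain-Python sketch of how gt, lt, and eq can map to comparisons. It is not Evidently's implementation: the CONDITIONS mapping and the check and run_checks helpers are hypothetical names introduced only for illustration.

```python
import operator

# Hypothetical mapping from condition names to comparison functions.
# This illustrates the semantics only; it is not the library's code.
CONDITIONS = {
    "gt": operator.gt,  # greater than
    "lt": operator.lt,  # less than
    "eq": operator.eq,  # equal
}


def check(value, condition, threshold):
    """Return True if `value` satisfies `condition` against `threshold`."""
    return CONDITIONS[condition](value, threshold)


def run_checks(value, **conditions):
    """Evaluate several named conditions against one value."""
    return {name: check(value, name, threshold)
            for name, threshold in conditions.items()}


# Example: test that a response length of 120 is above 0 and below 200.
results = run_checks(120, gt=0, lt=200)
print(results)
```

A test passes only when every condition attached to it holds, so a regression shows up as at least one False entry.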
share_tests instead of tests.
eval_dataset that we prepared earlier, and send it to the Evidently Cloud.
my_eval or my_eval.json().
eval_data_2 to imitate the result of the change.