- Ad hoc analysis. Spot-check the quality of your data or AI outputs.
- Experiments. Test different parameters, models, or prompts and compare outcomes.
- Safety and adversarial testing. Evaluate how your system handles edge cases and adversarial inputs, including on synthetic data.
- Regression testing. Ensure performance does not degrade after updates or fixes.
- Monitoring. Track response quality in production systems.
## Evaluations via API

Supported in Evidently OSS, Evidently Cloud, and Evidently Enterprise.
- Run Python-based evaluations on your AI outputs by generating Reports (a minimal sketch follows this list).
- Upload the results to the Evidently Platform.
- Use the Explore feature to compare and debug results across runs.
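A minimal sketch of this workflow, assuming the Evidently 0.4.x Python API (module paths and preset names differ in later releases); the `response` column, the project name, and the token value are illustrative placeholders:

```python
import pandas as pd

from evidently.report import Report
from evidently.metric_preset import TextEvals
from evidently.descriptors import Sentiment, TextLength
from evidently.ui.workspace.cloud import CloudWorkspace

# Illustrative data: one text column with LLM outputs to evaluate.
df = pd.DataFrame({
    "response": [
        "Sure! To reset your password, open Settings and choose 'Security'.",
        "I cannot help with that request.",
    ]
})

# Score every row with built-in descriptors and summarize them in a Report.
report = Report(metrics=[
    TextEvals(column_name="response", descriptors=[
        Sentiment(),
        TextLength(),
    ])
])
report.run(reference_data=None, current_data=df)

# Upload the run to the Evidently Platform.
ws = CloudWorkspace(
    token="YOUR_API_TOKEN",  # placeholder: generate a token in the UI
    url="https://app.evidently.cloud",
)
project = ws.create_project("LLM evals")  # on Cloud, a team/org id may also be required
ws.add_report(project.id, report, include_data=True)  # include_data keeps row-level scores for Explore
```

Each uploaded run appears in the project on the Platform, where the Explore feature lets you put runs side by side.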
## No-code evaluations

Supported in Evidently Cloud and Evidently Enterprise.
- Analyze CSV datasets. Drag and drop CSV files and evaluate their contents on the Platform.
- Evaluate uploaded datasets. Assess collected traces from instrumented LLM applications or any Datasets you previously uploaded or generated.