DeepEval

evaluation framework

Evaluation framework for LLM applications that expresses evaluation metrics as unit tests

Pros and Cons

Advantages

  • + Metrics as unit tests
  • + Pytest integration
  • + Multiple metrics available
  • + Red teaming included
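
To make the "metrics as unit tests" idea concrete, here is a minimal, library-free sketch in the pytest style. It does not use DeepEval's actual API: `KeywordCoverageMetric`, `fake_llm_app`, and the threshold value are hypothetical names chosen for illustration, and the keyword check stands in for whatever metric a real framework would compute.

```python
from dataclasses import dataclass


@dataclass
class KeywordCoverageMetric:
    """Hypothetical metric: fraction of expected keywords present in the output."""

    threshold: float = 0.5

    def measure(self, output: str, expected_keywords: list[str]) -> float:
        # An empty expectation list is treated as trivially satisfied.
        if not expected_keywords:
            return 1.0
        text = output.lower()
        hits = sum(1 for kw in expected_keywords if kw.lower() in text)
        return hits / len(expected_keywords)

    def is_successful(self, score: float) -> bool:
        # The test passes when the score reaches the threshold.
        return score >= self.threshold


def fake_llm_app(question: str) -> str:
    # Stand-in for the LLM application under test (assumption: fixed reply).
    return "You can request a refund within 30 days of purchase."


def test_refund_answer_mentions_policy():
    # pytest collects functions named test_*; a failed assert fails the test.
    metric = KeywordCoverageMetric(threshold=0.5)
    output = fake_llm_app("What is the refund policy?")
    score = metric.measure(output, ["refund", "30 days"])
    assert metric.is_successful(score), f"score {score:.2f} below threshold"
```

Saved in a file such as `test_llm_app.py`, this would be picked up by a plain `pytest` run. The point of the pattern is that a quality score becomes a pass/fail assertion with an explicit threshold.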

Disadvantages

  • - Relatively new project
  • - Some metrics rely on an LLM as a judge, which adds cost and latency

Use Cases

  • LLM application testing
  • RAG evaluation
  • CI/CD for AI
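
For the CI/CD use case, one common shape is a quality gate: collect per-case scores, compare them to a threshold, and return a non-zero status so the pipeline fails. The sketch below assumes scores have already been computed by some evaluation step; `find_failures`, `run_gate`, the case names, and the 0.7 threshold are all hypothetical and not part of DeepEval.

```python
def find_failures(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the names of test cases whose score is below the threshold."""
    return [name for name, score in scores.items() if score < threshold]


def run_gate(scores: dict[str, float], threshold: float = 0.7) -> int:
    """Print each failing case and return a process-style exit code."""
    failures = find_failures(scores, threshold)
    for name in failures:
        print(f"FAIL {name}: {scores[name]:.2f} < {threshold}")
    # 0 means the gate passed; 1 signals the CI job to fail.
    return 1 if failures else 0
```

In a pipeline script, the return value would typically be passed to `sys.exit(...)` so that a failing evaluation blocks the merge or deployment.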
