- **18 Metrics**: Research-backed evaluation metrics for RAG, agentic, conversational, and safety use cases, all out of the box.
- **Vitest / Jest Integration**: Run evaluations inside the test runner you already use, with familiar `describe`/`it`/`expect` patterns.
- **Provider Agnostic**: Works with OpenAI, Anthropic, Google Gemini, Azure OpenAI, and Ollama, or bring your own provider.
- **CLI Tool**: Run evaluations from the command line with `assay run`, scaffold configs with `assay init`, and list the available metrics.
- **AI SDK Adapter**: Pipe Vercel AI SDK `generateText` and `streamText` results straight into evaluation with zero boilerplate.
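As a rough sketch of what the test-runner integration could look like: the snippet below is hypothetical, not the library's confirmed API. The `assay` import path, the `evaluate` helper, the `answerRelevancy` metric, and the `[0, 1]` score range are all assumptions; only the `describe`/`it`/`expect` pattern comes from the Vitest side.

```typescript
// Hypothetical sketch: the "assay" import, `evaluate` helper, and
// `answerRelevancy` metric name are assumptions, not the confirmed API.
import { describe, it, expect } from "vitest";
import { evaluate, answerRelevancy } from "assay"; // assumed exports

describe("RAG answer quality", () => {
  it("scores the answer as relevant to the question", async () => {
    const result = await evaluate({
      input: "What is the capital of France?",
      output: "The capital of France is Paris.",
      metric: answerRelevancy, // assumed metric name
    });
    // Assumes metric scores are normalized to [0, 1].
    expect(result.score).toBeGreaterThan(0.7);
  });
});
```

Because the evaluation runs inside an ordinary `it` block, it participates in watch mode, CI reporting, and test filtering like any other Vitest test.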
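A minimal command-line session might look like the following. Only the `assay init` and `assay run` commands are named above; anything beyond invoking them bare is not shown here, since flags and subcommand names for listing metrics are not specified.

```shell
# Scaffold an evaluation config in the current project
assay init

# Run the configured evaluations
assay run
```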