# Workflows
Common evaluation patterns for Eqho agents. Each workflow builds on the core `eqho-eval` commands, and workflows can be combined with one another.
## Workflow index
- Multi-turn conversations — Test full conversations with simulated users derived from real calls
- Multi-model comparison — Test your agent across GPT, Claude, Gemini, and more in a single run
- Tool call validation — Verify agents call the right tools with correct arguments
- Safety testing — Catch prompt injection, identity leakage, and refusal gaps
- Agentic workflows — Use Claude Code or Cursor to iterate on evals with natural language
- CI/CD — Automate evaluations in GitHub Actions
## Action lifecycle testing
Eqho agents have a full call lifecycle that can be tested at each stage:
```
Pre-Call → On-Call-Start → Live Actions → Postcall Actions → Disposition
```
### Live action eval
Test whether the agent calls the right tools during a live conversation:
```bash
eqho-eval action-eval --campaign <id>
cd action-eval && eqho-eval eval
```
### Postcall action eval
Test the accuracy of data extracted from call transcripts:
```bash
eqho-eval postcall-eval --campaign <id> --calls 25
cd postcall-eval && eqho-eval eval
```
### Disposition eval
Test call outcome categorization accuracy:
```bash
eqho-eval postcall-eval --campaign <id> --disposition --calls 50
cd disposition-eval && eqho-eval eval
```
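The three recipes above can be chained into a single script, for example to re-run the whole lifecycle after a prompt change. This is a minimal POSIX sh sketch, not part of the tool: the `run_evals` wrapper, the install guard, and the `CAMPAIGN_ID` variable are illustrative; only the `eqho-eval` invocations themselves come from the recipes above.

```shell
#!/bin/sh
# Sketch: chain all three lifecycle evals for one campaign.
# Assumptions: eqho-eval is on PATH and CAMPAIGN_ID holds a real
# campaign id. The commands are the ones shown in this section.

run_evals() {
  campaign="$1"

  # Skip gracefully on machines without the CLI installed.
  if ! command -v eqho-eval >/dev/null 2>&1; then
    echo "eqho-eval not found; skipping"
    return 0
  fi

  # Live action eval
  eqho-eval action-eval --campaign "$campaign"
  (cd action-eval && eqho-eval eval)

  # Postcall action eval
  eqho-eval postcall-eval --campaign "$campaign" --calls 25
  (cd postcall-eval && eqho-eval eval)

  # Disposition eval
  eqho-eval postcall-eval --campaign "$campaign" --disposition --calls 50
  (cd disposition-eval && eqho-eval eval)
}

run_evals "$CAMPAIGN_ID"
```

Each `cd` runs in a subshell so the script stays in the repository root between stages.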
All generated configs include proxy settings automatically.
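For the CI/CD workflow listed in the index, the same commands can run in GitHub Actions. The fragment below is a hypothetical sketch: the trigger, runner, and install step are assumptions (this page does not document how `eqho-eval` is distributed); only the `eqho-eval eval` invocation comes from the recipes above.

```yaml
# Hypothetical workflow: .github/workflows/agent-evals.yml
name: agent-evals
on: [pull_request]

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumption: eqho-eval installs via npm -- replace with
      # however the CLI is actually distributed in your setup.
      - run: npm install -g eqho-eval
      # Run an eval from a previously generated config directory,
      # e.g. an action-eval/ directory checked into the repo.
      - run: cd action-eval && eqho-eval eval
```

Because generated configs already include proxy settings, the checked-in config directory should run unmodified in CI.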