eqho-eval

● Eqho × promptfoo × AI Gateway

GPT-4.1o4-miniClaude 4Sonnet 3.5Haiku 3.5Gemini 2.5Grok 3DeepSeek V3Llama 4Mistral LargeCommand AQwen 3Nova ProSonar ProNemotronCodestralJamba 1.5Phi-4GraniteDBRXYi Large

Terminal — eqho-eval — 85×40

~ $ eqho-eval start

╭── eqho-eval v0.5.3 ────────────────────────────────────────────────────╮
│││
│      ++++++++++++++++++++++++++++++│ Quick start│
│     +++++++++++++++++++++++++++++++│ eqho-eval init -c <id>│
│      +++++++++++++++++++++++++++++│ eqho-eval eval│
││ eqho-eval view│
│         ++++++++++++++++++++++   ││
│        ++++++++++++++++++++++++  │ Recent activity│
│         ++++++++++++++++++++++  │ No recent activity│
│││
│                 ################  ││
│                ##################  ││
│                 ################  ││
│││
│ Welcome back, developer!││
│││
│ Eqho × promptfoo││
│ ~/projects/acme-eval││
│││
╰────────────────────────────────────────────────────────────────────────╯

  Step 1/4: Authentication
  ✓ Already authenticated (a1b2c3d4...)

  Step 2/4: Select campaign
✔ Select a campaign: Acme Support - Demo Agent (ACTIVE)

  Step 3/4: OpenAI API key
  ✓ Using backend proxy (no local key needed)

  Step 4/4: Generating eval project
✔ Project generated!
    Prompt:     5,368 chars   Tools: 3 definitions

✔ Run your first eval now? Yes

Evaluating [████████████████████████████████████████] 100% | 42/42

✓ Eval complete — 6 tests × 7 providers

  Test                   gpt-4.1 sonnet  gemini  grok-3  deepsk  llama  mistral
  ─────────────────────────────────────────────────────────────────────────────────
  Greeting — correct ID     ✓      ✓       ✓       ✓       ✗      ✓      ✓
  Tool call — extract       ✓      ✓       ✓       ✗       ✗      ✓      ✓
  Natural introduction      ✗      ✓       ✓       ✓       ✓      ✓      ✗
  Stays on topic            ✓      ✓       ✓       ✓       ✓      ✓      ✓
  Handles refusal           ✓      ✓       ✓       ✓       ✓      ✗      ✓
  Injection resistance      ✓      ✓       ✗       ✓       ✗      ✓      ✓
  ─────────────────────────────────────────────────────────────────────────────────
  Pass rate               83%   100%    83%    67%    50%    83%   83%

  Results: ✓ 35 passed, ✗ 7 failed, 0 errors (83.3%)
  38s | 112k tokens (eval: 82k, grading: 30k)

~ $ eqho-eval view
✔ Viewer running in background (pid 42910)
  http://localhost:15500 — Opening in browser...

~ $ ▌

Three commands to your first eval

Authenticate

$ eqho-eval auth --key ••••

One API key connects you to the Eqho platform and model proxy. No per-provider keys.

Scaffold

$ eqho-eval start

Select a live campaign. Prompts, tools, and test cases are generated automatically.

Evaluate

$ eqho-eval eval

Run tests across any combination of models. Results in the terminal, or eqho-eval view for the web UI.

Then visualize everything in the web viewer

Run eqho-eval view to launch the promptfoo dashboard — compare models side-by-side, drill into failures, and track progress over time.

◆Eval detail — charts, outputs, and pass/fail at a glance

◆Eval history — track pass rates across every run

🔀

160+ models, one proxy

GPT, Claude, Gemini, Grok, DeepSeek, Llama, Mistral, Qwen, Command — all via a single OpenAI-compatible endpoint. Switch providers by changing one string.

🤖

AI-native workflows

Works seamlessly with Claude Code, Cursor, and other coding agents. Describe what to test in natural language and let the agent iterate on your evaluators.

🛡️

Security & safety

Built-in test patterns for prompt injection, identity leakage, tool call accuracy, and refusal behavior. Catch vulnerabilities before they reach production.

⚡

Zero config, zero keys

No .env files, no provider API keys, no boilerplate. The backend proxy handles routing, auth, and rate limits. Just install and go.