Eqho × promptfoo

Evaluate your Eqho agents
before they go live.

Test prompts, dispositions, tool calls, and agent actions against real campaigns. Powered by promptfoo and routed through the Vercel AI Gateway.

Get started
$npm i -g eqho-eval
eqho-eval — zsh — 120×36
~ $ eqho-eval

   ++++++++++++++++++++++++++++++++++++++++++++   
 ++++++++++++++++++++++++++++++++++++++++++++++++ 
 +++++++++++++++++++++++++++++++++++++++++++++++++
 +++++++++++++++++++++++++++++++++++++++++++++++++
 ++++++++++++++++++++++++++++++++++++++++++++++++ 
  ++++++++++++++++++++++++++++++++++++++++++++++  

        ++++++++++++++++++++++++++++++++          
      ++++++++++++++++++++++++++++++++++++        
     ++++++++++++++++++++++++++++++++++++++       
     +++++++++++++++++++++++++++++++++++++++      
     ++++++++++++++++++++++++++++++++++++++       
      ++++++++++++++++++++++++++++++++++++        
        ++++++++++++++++++++++++++++++++          

                    ###########################   
                   #############################  
                  ############################### 
                  ############################### 
                   #############################  
                     #########################    

  eqho-eval v0.5.0  Eqho × promptfoo

~ $ eqho-eval auth --key ••••••••-••••-••••-••••-••••••••b866
✔ Authenticated with Eqho API
✔ Registered with backend (model proxy enabled)
Config saved to ~/.eqho-eval/config.json

~ $ eqho-eval init --campaign 6893e79f
✔ Project initialized for Kyles Eqho Appointment Setter
  Generated: promptfooconfig.yaml, prompts/sophia.json, tools/sophia.json

~ $ eqho-eval eval --no-cache
[Evaluation] ✓ Complete! 27/27 tests in 42s

Providers:
  openai:chat:o4-mini:      29,983 tokens (9 requests)
  openai:chat:gpt-4.1:      27,284 tokens (9 requests)
  openai:chat:gpt-4.1-mini: 27,217 tokens (9 requests)

Results: ✓ 15 passed, ✗ 12 failed, 0 errors (55.56%)
Duration: 42s (concurrency: 4)

~ $ 

One command setup

Point at any Eqho campaign and scaffold a full eval suite. Prompts, tools, and test cases generated automatically.

🔀

Multi-provider

Compare GPT-4.1, o4-mini, Claude, Gemini — all through a single proxy. No API keys to juggle.

🛡️

Prompt & tool testing

Test greetings, refusals, injection resistance, and tool call schemas. Red-team your agents before production.

📊

Usage tracking

Every token routed through Vercel AI Gateway. Per-user consumption, per-provider cost breakdown.