CI / CD
Automate evaluations in your CI pipeline to catch regressions before they reach production.
Generate a GitHub Actions workflow
eqho-eval ci
eqho-eval ci --threshold 80 # require 80%+ pass rate
eqho-eval ci -o .github/workflows # output to workflows directory
This generates a workflow file that:
- Installs
eqho-evalandpromptfoo - Authenticates using a repository secret
- Runs evaluations
- Fails the build if the pass rate drops below the threshold
Manual workflow setup
# .github/workflows/eval.yml
name: Agent Evaluation
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
eval:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: '22'
- name: Install tools
run: |
npm i -g eqho-eval promptfoo
- name: Authenticate
run: eqho-eval auth --key ${{ secrets.EQHO_API_KEY }}
- name: Run evaluation
run: |
cd my-eval
eqho-eval eval --no-cache --json > results.json
- name: Check pass rate
run: |
PASS_RATE=$(cat results.json | jq '.results.stats.successes / (.results.stats.successes + .results.stats.failures) * 100')
echo "Pass rate: ${PASS_RATE}%"
if (( $(echo "$PASS_RATE < 80" | bc -l) )); then
echo "::error::Pass rate ${PASS_RATE}% is below 80% threshold"
exit 1
fi
Repository secrets
Add your Eqho API key as a GitHub repository secret:
- Go to Settings → Secrets and variables → Actions
- Click New repository secret
- Name:
EQHO_API_KEY - Value: your Eqho API key
Non-interactive mode
For CI environments, use --yes and environment variables to avoid interactive prompts:
export EQHO_API_KEY=your-key
eqho-eval start --yes --campaign <id>
Or if the project is already scaffolded:
eqho-eval auth --key $EQHO_API_KEY
eqho-eval eval --no-cache
Comparing results across runs
Use eqho-eval diff to compare evaluation results between branches or runs:
eqho-eval diff baseline-results.json pr-results.json --threshold 5
This highlights any test cases that changed from pass to fail (or vice versa) and flags regressions above the threshold.
Caching considerations
- Use
--no-cachein CI to ensure fresh results every run - In local development, caching speeds up iteration significantly
- Run
eqho-eval cache clearto reset the cache if you suspect stale results