
10. Operate (Day 2)

Step 10 of 10 · Deliver to a customer

Step at a glance

🎯 Goal — Monthly KPI review, alert tuning, drift checks, regression evals against main. The accelerator runs in production; this step is what keeps it healthy.

📋 Prerequisite — 9. UAT & handover complete — customer ops has the packet; production is live.

💻 Where you'll work — App Insights + GitHub Actions (scheduled evals) + the customer's PR review surface.

✅ Done when — First monthly KPI review held; first alert tuned; first regression-eval run green on main. After that, this is a recurring loop, not a one-shot step.


This page is the generic Day-2 reference. The engagement-specific handover packet supersedes it for any customer that has one (see 9. UAT & handover).

What runs on its own

After azd up, the accelerator emits telemetry and enforces its gates without partner intervention:

  • Telemetry — every typed event declared in src/accelerator_baseline/telemetry.py flows into App Insights via OpenTelemetry. KPI events are dashboard-wired.
  • Content filters — Bicep-attached accelerator-default-policy blocks Medium+ on Hate / Sexual / Violence / Self-harm. Drift in the portal is overwritten on next azd provision.
  • Post-deploy regression evals — .github/workflows/post-deploy-eval.yml runs the quality + redteam suites on the deployed environment after every merge to main.
  • HITL gates — every side-effect tool routes through checkpoint(...). Failure to reach HITL_APPROVER_ENDPOINT is fail-closed.
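The fail-closed contract in the last bullet can be sketched as follows. The name checkpoint and the HITL_APPROVER_ENDPOINT variable come from the accelerator; the request/response shape here is an assumption for illustration, not the real API:

```python
import json
import os
import urllib.request

def checkpoint(tool_name, payload):
    """Fail-closed HITL gate sketch (payload/response shape is assumed).

    Posts the pending tool call to HITL_APPROVER_ENDPOINT; anything other
    than an explicit approval, including an unreachable endpoint, denies
    the side effect.
    """
    endpoint = os.environ.get("HITL_APPROVER_ENDPOINT")
    if not endpoint:
        return False  # fail closed: no approver configured, no side effects
    body = json.dumps({"tool": tool_name, "args": payload}).encode()
    req = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return json.load(resp).get("approved") is True
    except (OSError, ValueError):
        return False  # fail closed on any transport or parse error
```

The key property ops relies on is the final except clause: network failure and approval denial are indistinguishable to the calling tool, so an outage can never silently let side effects through.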

What customer ops owns

| Loop | Cadence | What |
| --- | --- | --- |
| KPI review | Monthly | Pull the dashboard panels declared in accelerator.yaml -> kpis. Compare against the brief's hypothesis numbers and the prior month. Flag drift to the partner team. |
| Alert tuning | As needed | Latency, error rate, eval-suite drift. Adjust thresholds based on observed baselines after the first 30 days. |
| Regression evals | Per release + nightly | Confirm evals/quality/ and evals/redteam/ are green on main against the production API URL. |
| Secret rotation | Per partner-practice schedule | AZURE_CLIENT_ID federated credential (Entra); HITL_APPROVER_ENDPOINT if the approver moves. |
| Model swap | When a new model is qualified | Edit accelerator.yaml -> models[], re-run azd up. The lint rules models_block_shape + agent_model_refs_exist block malformed manifests at PR time. |
| Killswitch drills | Quarterly | Practice flipping KILLSWITCH=1 in a non-prod environment; confirm the API returns 503 cleanly and the alert fires. |
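The behavior a killswitch drill verifies can be sketched in a few lines. KILLSWITCH=1 and the 503 come from the drill description above; the function name and return shape are illustrative, not the accelerator's actual middleware:

```python
import os

def killswitch_response(env=os.environ):
    """Sketch of the short-circuit a killswitch drill checks for (assumed shape).

    When KILLSWITCH=1, the API should answer every request with a clean 503
    before any model or tool call is attempted.
    """
    if env.get("KILLSWITCH") == "1":
        return 503, "Service disabled by killswitch"  # immediate, clean 503
    return None  # killswitch off: normal request handling proceeds
```

A drill passes when flipping the variable produces exactly this short-circuit and the corresponding alert fires; anything slower or noisier (timeouts, 500s, partial responses) is a finding.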

When something breaks

  1. Open App Insights. Filter on severityLevel >= 3 for the failing time window.
  2. Find the trace. Each end-to-end request emits a trace with the supervisor decision record + every worker invocation + every tool call (with HITL outcome).
  3. Check the lint + eval status on main. If the post-deploy regression suite is red, that's where the regression entered.
  4. Roll back if needed. azd deploy against a tagged commit; document in the packet's rollback section.
  5. File a PR with the fix. PR-gated CI (lint + quality evals + redteam) blocks merge until green.
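Step 1's triage is just a severity threshold over a time window. A minimal local sketch, assuming a flat record shape (the real App Insights export schema differs):

```python
def failing_window(records, start, end, min_severity=3):
    """Return error-level records (severityLevel >= 3) inside [start, end).

    Each record is assumed to be a dict with 'timestamp' (epoch seconds)
    and 'severityLevel' keys -- an illustrative shape, not the actual
    App Insights export format.
    """
    return [
        r for r in records
        if start <= r["timestamp"] < end and r["severityLevel"] >= min_severity
    ]
```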

Looping back

When the customer asks for a new capability:

  • Small additions (a new tool, a new worker, a model swap) — back to 8. Iterate & evaluate.
  • A new business scenario — back to 5. Discover with the customer for that scenario, then through scaffold → provision → iterate → UAT → handover. Multiple scenarios coexist under src/scenarios/<id>/.

The detailed runbook content (model swap procedures, secret rotation walkthroughs, killswitch drills, full alert reference) lives in the legacy customer runbook, which remains the deep reference under Reference → Delivery context.


End of walkthrough. For the next engagement, Track 1 (Get ready) stays done — return directly to 4. Clone for the customer and run Track 2 with the new customer's short-name.