Architecture patterns¶
What the accelerator ships today, and the shapes a partner can reasonably customise it into.
Flagship topology (what azd up deploys)¶
Pattern — supervisor routing, 4 specialists (grouped in 2 workstreams), retrieval-backed, HITL for side-effects.
user → POST /research/stream
          │
          ▼
┌─────────────────────┐
│ Supervisor agent    │  (accel-sales-research-supervisor)
└─────────────────────┘
          │ plans + routes
          ├───────────────────────────────────────────┐
          ▼                                           ▼
┌─────────────────────┐                   ┌───────────────────────┐
│ Account Researcher  │                   │ ICP Fit Analyst       │
│ Competitive Context │                   │ Outreach Personaliser │
└─────────────────────┘                   └───────────────────────┘
          │                                           │
          └──────────────── grounded via ─────────────┘
                                │
                                ▼
                 ┌────────────────────────────┐
                 │ Azure AI Search (accounts) │  <-- seeded at postprovision
                 │ Web search (allow-listed)  │
                 └────────────────────────────┘
Code pointers:
- Workflow: src/scenarios/sales_research/workflow.py
- Agents (prompt / transform / validate trio): src/scenarios/sales_research/agents/<agent>/
- Retrieval: src/retrieval/ai_search.py + scenario index schema in src/scenarios/sales_research/retrieval.py
- Endpoint binding: src/main.py reads accelerator.yaml and mounts /research/stream via src.workflow.registry.load_scenario
- Tools (side-effect, HITL-gated): src/tools/
See Flagship implementation detail at the bottom for the full agent graph (including the aggregator + HITL gate), Azure topology, and step-by-step control flow.
Frontend layer (partner-built)¶
The accelerator stops at the API. The UI a customer actually clicks is the partner's value-add — bespoke UX, auth, branding, and any approval surfaces are out of scope for the template.
To shorten the runway, a reference UI starter ships at
patterns/sales-research-frontend/:
React + Vite + TypeScript, consumes POST /research/stream directly via the
Fetch + ReadableStream SSE pattern, deploys to Azure Static Web Apps. It is
deliberately minimal (no auth, no state persistence, no framework lock-in)
so partners can fork-and-customise rather than rip out opinions.
The pattern is not built or tested in CI — it's reference material the partner lifts into their own pipeline once they've customised it.
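The reference UI does its stream parsing in TypeScript. For partners who script against POST /research/stream from Python instead, the same framing logic can be sketched as below. This is a minimal illustration of standard SSE framing only: the function name iter_sse_data is hypothetical, and the payloads in the usage lines are made up, since the event schema the endpoint emits is not described here.

```python
from typing import Iterable, Iterator


def iter_sse_data(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event found in text chunks.

    Chunks may split an event anywhere, so incomplete events stay buffered
    until the blank line that terminates them arrives.
    """
    buffer = ""
    for chunk in chunks:
        # Normalise CRLF on the whole buffer so a "\r\n" split across two
        # chunks is still handled.
        buffer = (buffer + chunk).replace("\r\n", "\n")
        while "\n\n" in buffer:
            raw_event, buffer = buffer.split("\n\n", 1)
            data_lines = [
                line[5:].removeprefix(" ")  # drop "data:" and one optional space
                for line in raw_event.split("\n")
                if line.startswith("data:")
            ]
            if data_lines:
                yield "\n".join(data_lines)


# Hypothetical payloads; the first event arrives split across two chunks.
chunks = ['data: {"type": "sta', 'tus"}\n\ndata: done\n\n']
for payload in iter_sse_data(chunks):
    print(payload)
```

Buffering until the blank-line delimiter is the part that matters: network reads do not respect event boundaries, whichever client language is used.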
Invariants the lint enforces¶
scripts/accelerator-lint.py is the authoritative contract. Partners must not violate:
| Invariant | Rule |
|---|---|
| Agent instructions authored in docs/agent-specs/*.md, synced to Foundry at startup | agent_specs_no_hardcoded_model; spec + prompt modules reference agents by Foundry agent name |
| accelerator.yaml resolves to importable modules and has well-formed retrieval schema refs | scenario_manifest_valid |
| Every side-effect tool has a HITL checkpoint | side_effect_tools_call_hitl |
| Content filter declared in Bicep (not portal) | bicep_has_content_filter |
| No key-based secrets in code; managed identity only; Key Vault uses RBAC | no_hardcoded_secrets, uses_default_azure_credential, key_vault_rbac_only |
| SDK matrix cannot drift from GA | sdks_pinned_to_ga, dockerfile_matches_ga_pins |
| Container image ships the manifest | dockerfile_copies_manifest |
Run python scripts/accelerator-lint.py locally; CI runs the same thing.
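To give a feel for what a rule such as side_effect_tools_call_hitl has to establish, here is a minimal sketch of a static check built on Python's ast module. It is not the implementation in scripts/accelerator-lint.py: the function name functions_missing_hitl, the assumption that the checkpoint is invoked as a call named checkpoint, and the sample tool source are all hypothetical.

```python
import ast


def functions_missing_hitl(source: str, checkpoint_name: str = "checkpoint") -> list[str]:
    """Return names of top-level functions that never call `checkpoint_name`."""
    missing = []
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        called = set()
        for child in ast.walk(node):
            if isinstance(child, ast.Call):
                func = child.func
                # Covers both `checkpoint(...)` and `hitl.checkpoint(...)`.
                if isinstance(func, ast.Name):
                    called.add(func.id)
                elif isinstance(func, ast.Attribute):
                    called.add(func.attr)
        if checkpoint_name not in called:
            missing.append(node.name)
    return missing


# Hypothetical tool module: only parsed, never executed.
SAMPLE = '''
def send_email(to, body):
    hitl.checkpoint("send_email", to=to)
    return smtp_send(to, body)

def crm_write_contact(record):
    return crm.upsert(record)
'''

print(functions_missing_hitl(SAMPLE))  # ['crm_write_contact']
```

A check like this only proves that a call exists somewhere in the function body, not that it runs before the side-effect, so it complements review and does not replace it.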
Customizing models per agent¶
The accelerator provisions a single default model deployment out of the box (gpt-5-mini). To give individual agents a different model — e.g. put the supervisor on gpt-5 while workers stay on gpt-5-mini for speed and cost — add a models: block to accelerator.yaml:
models:
  - slug: default            # reserved slug; must be the default entry
    deployment_name: gpt-5-mini
    model: gpt-5-mini
    version: "2025-08-07"
    capacity: 30
    default: true
  - slug: planner            # arbitrary slug; agents reference by slug
    deployment_name: gpt-5-planner
    model: gpt-5
    version: "2025-08-07"
    capacity: 10
scenario:
  agents:
    - { id: supervisor, foundry_name: accel-sales-research-supervisor, model: planner }
    - { id: account_planner, foundry_name: accel-account-planner }  # no `model:` → default
How it flows:
- Bicep parses accelerator.yaml at compile time via loadYamlContent in infra/main.bicep, splits the models: block into a default entry plus extras, and provisions each in infra/modules/foundry.bicep (@batchSize(1): one at a time, to avoid Foundry capacity-queue rejections). All chat / generative model deployments are bound to the shared RAI (content filter) policy accelerator-default-policy; the embedding deployment is provisioned without a RAI binding (content filtering doesn't apply to embeddings, which don't generate). The output AZURE_AI_FOUNDRY_MODEL_MAP is a slug -> deployment_name object, passed into the Container App as the AZURE_AI_FOUNDRY_MODEL_MAP env var.
- FastAPI startup (src/bootstrap.py) reads the model map and resolves each Foundry agent's scenario.agents[].model slug (or default when omitted) before calling create_or_update against Foundry. The bootstrap is idempotent: restarts re-converge to the manifest.
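The slug resolution described in the second step can be sketched as follows. This is an illustration under stated assumptions, not the code in src/bootstrap.py: it assumes the AZURE_AI_FOUNDRY_MODEL_MAP value arrives as a JSON-encoded object, and resolve_deployments is a hypothetical helper name. The agent entries mirror the manifest example above.

```python
import json


def resolve_deployments(agents: list[dict], model_map_json: str) -> dict[str, str]:
    """Map each agent's Foundry name to the deployment its model slug points at."""
    model_map = json.loads(model_map_json)  # assumed shape: {slug: deployment_name}
    resolved = {}
    for agent in agents:
        slug = agent.get("model", "default")  # omitted `model:` falls through to default
        if slug not in model_map:
            raise KeyError(f"agent {agent['id']!r} references undeclared model slug {slug!r}")
        resolved[agent["foundry_name"]] = model_map[slug]
    return resolved


agents = [
    {"id": "supervisor", "foundry_name": "accel-sales-research-supervisor", "model": "planner"},
    {"id": "account_planner", "foundry_name": "accel-account-planner"},
]
model_map_json = '{"default": "gpt-5-mini", "planner": "gpt-5-planner"}'

print(resolve_deployments(agents, model_map_json))
```

Because the result depends only on the manifest and the model map, running it again on restart yields the same mapping, which is what makes an idempotent create_or_update pass possible.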
Lint enforcement (both BLOCKING):
- models_block_shape: every entry has slug/deployment_name/model/version/capacity; slugs and deployment names are unique; exactly one entry has default: true and uses slug: default (reserved).
- agent_model_refs_exist: every scenario.agents[].model references a declared slug; omitting the field falls through to slug default.
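The two rules above translate naturally into small validators over the parsed manifest. The sketch below is an assumption about how such checks could look, not the code in scripts/accelerator-lint.py; the function names and error strings are hypothetical, and the inputs are plain dicts as they would come out of a YAML parser.

```python
REQUIRED_FIELDS = ("slug", "deployment_name", "model", "version", "capacity")


def check_models_block(models: list[dict]) -> list[str]:
    """Shape check: required fields, uniqueness, exactly one reserved default."""
    errors = []
    for i, entry in enumerate(models):
        missing = [f for f in REQUIRED_FIELDS if f not in entry]
        if missing:
            errors.append(f"models[{i}] missing fields: {', '.join(missing)}")
    for field in ("slug", "deployment_name"):
        values = [e[field] for e in models if field in e]
        dupes = sorted({v for v in values if values.count(v) > 1})
        if dupes:
            errors.append(f"duplicate {field}: {', '.join(dupes)}")
    defaults = [e for e in models if e.get("default") is True]
    if len(defaults) != 1:
        errors.append(f"expected exactly one default entry, found {len(defaults)}")
    elif defaults[0].get("slug") != "default":
        errors.append("the default entry must use slug 'default'")
    return errors


def check_agent_model_refs(agents: list[dict], models: list[dict]) -> list[str]:
    """Reference check: every explicit agent model slug must be declared."""
    declared = {e.get("slug") for e in models}
    return [
        f"agent {a['id']!r} references undeclared slug {a['model']!r}"
        for a in agents
        if "model" in a and a["model"] not in declared
    ]


models = [
    {"slug": "default", "deployment_name": "gpt-5-mini", "model": "gpt-5-mini",
     "version": "2025-08-07", "capacity": 30, "default": True},
    {"slug": "planner", "deployment_name": "gpt-5-planner", "model": "gpt-5",
     "version": "2025-08-07", "capacity": 10},
]
agents = [{"id": "supervisor", "model": "planner"}, {"id": "account_planner"}]

print(check_models_block(models), check_agent_model_refs(agents, models))
```

Returning a list of messages instead of raising on the first failure lets a lint run report every problem in one pass.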
Omitting the whole models: block is supported: infra/main.bicep falls back to a built-in default (gpt-5-mini / 2025-08-07 / capacity 30) — the same end state whether the block was ever there or not. Partners who want a different default deployment MUST use the models: block with a single default entry.
Variations a partner can choose¶
These are stub scaffolds + chatmode-driven walkthroughs — not drop-in packages. Each variant ships a minimal source file under patterns/<variant>/src/ and a /switch-to-variant <variant> chatmode that copies it over src/main.py, prunes the flagship workers, and updates accelerator.yaml.solution.pattern. Partner finishes the re-authoring under src/scenarios/<new-id>/. See .github/chatmodes/switch-to-variant.chatmode.md for guidance.
| Variant | Status today | When to reach for it |
|---|---|---|
| Supervisor routing | Shipped (default). Flagship under src/scenarios/sales_research/. Lint, evals, and bootstrap all assume this shape. | Research / multi-facet briefing. Default. |
| Single-agent retrieval | Stub + walkthrough at patterns/single-agent/ (one src/agent.py + chatmode). The walkthrough lives only in the GitHub repo; it is not published to the docs site. | Doc Q&A, policy lookup: one agent + retrieval, no side-effects. |
| Chat-with-actioning | Stub + walkthrough at patterns/chat-with-actioning/ (one src/chat.py + chatmode). HITL gating from the flagship still applies to every side-effect. | Conversational UX over multi-turn tool use. |
Going beyond 3–5 coordinated workers is out of scope for v1: the evaluation surface and HITL coordination become unmanageable. Split the work into separate engagements instead.
Anti-patterns the lint or reviewers reject¶
| Anti-pattern | Why |
|---|---|
| Agent instructions in Python source | Authoring source is docs/agent-specs/*.md; src/bootstrap.py syncs them to Foundry at startup. Code references agents by ID only. |
| Side-effect tool without HITL checkpoint | HITL is a hard invariant for every side_effect_tools entry in accelerator.yaml. |
| Inline SDK pins in src/Dockerfile | Source of truth is pyproject.toml + ga-versions.yaml; lint blocks. |
| Direct model SDK calls bypassing telemetry | Go through agent_framework clients so OTel + cost tracking stay live. |
| Grounding source without declared ACL posture | Declare it in accelerator.yaml or the scenario retrieval schema. |
Out of scope for v1¶
- Sovereign cloud deployments
- Multi-tenant control plane
- Cryptographic attestation of partner-customised repos (community support model)
- Terraform first-class (BYO-IaC is fine; match the Bicep contracts)
- .NET / Java runtime parity (roadmap item)
- More than 5 coordinated agents
Flagship implementation detail¶
A deeper walkthrough of how the flagship scenario runs end to end in the customer's subscription: how the supervisor-routing pattern materialises in src/scenarios/sales_research/ and in the Bicep under infra/.
Agent graph (flagship, as shipped)¶
            ┌───────────────────────────────────┐
            │         Supervisor agent          │
            │    (intent + parallel routing)    │
            └───────────────────────────────────┘
              │         │             │       │
     ┌────────┘    ┌────┘        ┌────┘       └────┐
     ▼             ▼             ▼                 ▼
┌──────────┐ ┌───────────┐ ┌────────────┐ ┌─────────────────┐
│ Account  │ │ ICP / Fit │ │Competitive │ │    Outreach     │
│Researcher│ │  Analyst  │ │  Context   │ │  Personalizer   │
└──────────┘ └───────────┘ └────────────┘ └─────────────────┘
     │             │             │                 │
     └─────────────┴──────┬──────┴─────────────────┘
                          ▼
                  ┌───────────────┐
                  │  Aggregator   │  assembles sales brief + outreach draft
                  └───────────────┘
                          ▼
            ┌──────────────────────────┐
            │ HITL gate: CRM write /   │
            │ email send               │
            └──────────────────────────┘
                          ▼
           (crm_write_contact, send_email)
Azure topology (deployed by azd up)¶
┌─────────────────────── Customer subscription ──────────────────────┐
│                                                                    │
│  Container Apps (API + worker)                                     │
│    ├── Managed Identity ───────────────────────────────────┐       │
│    ├── uses DefaultAzureCredential                         │       │
│    └── emits OpenTelemetry → App Insights                  │       │
│                                                            │       │
│  Microsoft Foundry project ◀───────────────────────────────┤       │
│      agents retrieved by name; instructions in portal      │       │
│      content filters applied via IaC                       │       │
│                                                            │       │
│  Azure AI Search ◀─────────────────────────────────────────┤       │
│      index: accounts, past-touches, competitive            │       │
│                                                            │       │
│  Azure Key Vault ◀─────────────────────────────────────────┘       │
│      secrets for CRM, SMTP, misc external systems                  │
│                                                                    │
│  Application Insights + Log Analytics                              │
│      dashboards: roi-kpis, hitl-activity, cost-attribution         │
│                                                                    │
│  (optional) Private Endpoints: required for regulated workloads    │
└────────────────────────────────────────────────────────────────────┘
Data + control flow¶
- Partner or customer invokes the API (src/main.py).
- src/scenarios/sales_research/workflow.py executes the supervisor with the request.
- Supervisor classifies intent and dispatches to workers (parallel).
- Workers ground via src/retrieval/ai_search.py and tools; return structured outputs.
- Aggregator composes the final brief + outreach draft.
- Any side-effect tool (CRM write, email send) passes accelerator_baseline/hitl.checkpoint first.
- Telemetry events are emitted to App Insights; KPI events drive ROI dashboards.
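The control flow above (parallel fan-out, aggregation, then a HITL gate in front of side-effects) can be sketched with plain asyncio. Everything here is a stand-in chosen for illustration: the worker names, the aggregate and hitl_checkpoint functions, and the approval set are hypothetical and do not reproduce workflow.py or the accelerator_baseline HITL API.

```python
import asyncio

# Hypothetical worker identifiers, named after the four specialists.
WORKERS = ["account_researcher", "icp_fit_analyst", "competitive_context", "outreach_personalizer"]


async def run_worker(name: str, request: str) -> dict:
    await asyncio.sleep(0)  # stands in for retrieval + model calls
    return {"worker": name, "finding": f"{name} notes for {request}"}


def aggregate(results: list[dict]) -> dict:
    """Compose worker outputs into one brief and propose side-effect actions."""
    return {"brief": [r["finding"] for r in results], "proposed_actions": ["send_email"]}


def hitl_checkpoint(action: str, approved_actions: set[str]) -> bool:
    """Stand-in for a human approval step: only pre-approved actions pass."""
    return action in approved_actions


async def run_flow(request: str, approved_actions: set[str]) -> dict:
    # Fan out to all workers concurrently; gather preserves input order.
    results = await asyncio.gather(*(run_worker(w, request) for w in WORKERS))
    brief = aggregate(list(results))
    # Side-effects are gated after aggregation, outside the workers.
    brief["executed_actions"] = [
        a for a in brief["proposed_actions"] if hitl_checkpoint(a, approved_actions)
    ]
    return brief


result = asyncio.run(run_flow("Contoso", approved_actions={"send_email"}))
print(result["executed_actions"])
```

Keeping the gate in run_flow and out of the workers is the point of the shape: approval policy can change without touching any worker.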
Why this shape¶
- Parallel specialists reduce latency vs serial single-agent prompting.
- Aggregator as executor keeps composition logic out of worker prompts (easier to eval).
- HITL at the edge (not inside workers) means we can audit and change approval policy without rewriting worker prompts.
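- Foundry-managed instructions let non-engineers inspect and trial agent behavior in the portal; durable changes belong in docs/agent-specs/*.md, which src/bootstrap.py syncs to Foundry at startup.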