Architecture patterns¶

What the accelerator ships today, and the shapes a partner can reasonably customise into.

Flagship topology (what `azd up` deploys)¶

Pattern — supervisor routing, 4 specialists (grouped in 2 workstreams), retrieval-backed, HITL for side-effects.

  user → POST /research/stream
            │
            ▼
   ┌─────────────────────┐
   │ Supervisor agent    │  (accel-sales-research-supervisor)
   └─────────────────────┘
            │   plans + routes
            ├───────────────────────────────────────────┐
            ▼                                           ▼
   ┌─────────────────────┐                 ┌───────────────────────┐
   │ Account Researcher  │                 │ ICP Fit Analyst       │
   │ Competitive Context │                 │ Outreach Personaliser │
   └─────────────────────┘                 └───────────────────────┘
            │                                           │
            └──────────────── grounded via ─────────────┘
                                 │
                                 ▼
                   ┌────────────────────────────┐
                   │ Azure AI Search (accounts) │   <-- seeded at postprovision
                   │ Web search (allow-listed)  │
                   └────────────────────────────┘

Code pointers:

Workflow: src/scenarios/sales_research/workflow.py
Agents (prompt / transform / validate trio): src/scenarios/sales_research/agents/<agent>/
Retrieval: src/retrieval/ai_search.py + scenario index schema in src/scenarios/sales_research/retrieval.py
Endpoint binding: src/main.py reads accelerator.yaml and mounts /research/stream via src.workflow.registry.load_scenario
Tools (side-effect, HITL-gated): src/tools/

See Flagship implementation detail at the bottom for the full agent graph (including the aggregator + HITL gate), Azure topology, and step-by-step control flow.

Frontend layer (partner-built)¶

The accelerator stops at the API. The UI a customer actually clicks is the partner's value-add — bespoke UX, auth, branding, and any approval surfaces are out of scope for the template.

To shorten the runway, a reference UI starter ships at patterns/sales-research-frontend/: React + Vite + TypeScript, consumes POST /research/stream directly via the Fetch + ReadableStream SSE pattern, deploys to Azure Static Web Apps. It is deliberately minimal (no auth, no state persistence, no framework lock-in) so partners can fork-and-customise rather than rip out opinions.

   browser  ──HTTPS──►  Static Web App  ──HTTPS──►  Container Apps (FastAPI)
                                                         │
                                                         └─► Foundry / Search / KV

The pattern is not built or tested in CI — it's reference material the partner lifts into their own pipeline once they've customised it.

Invariants the lint enforces¶

scripts/accelerator-lint.py is the authoritative contract. Partners must not violate:

Invariant	Rule
Agent instructions authored in `docs/agent-specs/*.md`, synced to Foundry at startup	`agent_specs_no_hardcoded_model`, spec + prompt modules reference by Foundry agent name
`accelerator.yaml` resolves to importable modules and has well-formed retrieval schema refs	`scenario_manifest_valid`
Every side-effect tool has a HITL checkpoint	`side_effect_tools_call_hitl`
Content filter declared in Bicep (not portal)	`bicep_has_content_filter`
No key-based secrets in code; managed identity only; Key Vault uses RBAC	`no_hardcoded_secrets`, `uses_default_azure_credential`, `key_vault_rbac_only`
SDK matrix cannot drift from GA	`sdks_pinned_to_ga`, `dockerfile_matches_ga_pins`
Container image ships the manifest	`dockerfile_copies_manifest`

Run python scripts/accelerator-lint.py locally; CI runs the same thing.

Customizing models per agent¶

The accelerator provisions a single default model deployment out of the box (gpt-5-mini). To give individual agents a different model — e.g. put the supervisor on gpt-5 while workers stay on gpt-5-mini for speed and cost — add a models: block to accelerator.yaml:

models:
  - slug: default                 # reserved slug; must be the default entry
    deployment_name: gpt-5-mini
    model: gpt-5-mini
    version: "2025-08-07"
    capacity: 30
    default: true
  - slug: planner                 # arbitrary slug; agents reference by slug
    deployment_name: gpt-5-planner
    model: gpt-5
    version: "2025-08-07"
    capacity: 10

scenario:
  agents:
    - { id: supervisor, foundry_name: accel-sales-research-supervisor, model: planner }
    - { id: account_planner, foundry_name: accel-account-planner }   # no `model:` → default

How it flows:

Bicep parses accelerator.yaml at compile time via loadYamlContent in infra/main.bicep, splits the models: block into a default entry plus extras, and provisions each in infra/modules/foundry.bicep (@batchSize(1) — one at a time to avoid Foundry capacity-queue rejections). All chat / generative model deployments are bound to the shared RAI (content filter) policy accelerator-default-policy; the embedding deployment is provisioned without a RAI binding (content filtering doesn't apply to embeddings — they don't generate). Output AZURE_AI_FOUNDRY_MODEL_MAP is a slug -> deployment_name object, passed into the Container App as the AZURE_AI_FOUNDRY_MODEL_MAP env var.
FastAPI startup (src/bootstrap.py) reads the model map and resolves each Foundry agent's scenario.agents[].model slug (or default when omitted) before calling create_or_update against Foundry. The bootstrap is idempotent — restarts re-converge to the manifest.

Lint enforcement (both BLOCKING):

models_block_shape — every entry has slug/deployment_name/model/version/capacity; slugs and deployment names are unique; exactly one entry has default: true and uses slug: default (reserved).
agent_model_refs_exist — every scenario.agents[].model references a declared slug; omitting the field falls through to slug default.

Omitting the whole models: block is supported: infra/main.bicep falls back to a built-in default (gpt-5-mini / 2025-08-07 / capacity 30) — the same end state whether the block was ever there or not. Partners who want a different default deployment MUST use the models: block with a single default entry.

Variations a partner can choose¶

These are stub scaffolds + chatmode-driven walkthroughs — not drop-in packages. Each variant ships a minimal source file under patterns/<variant>/src/ and a /switch-to-variant <variant> chatmode that copies it over src/main.py, prunes the flagship workers, and updates accelerator.yaml.solution.pattern. Partner finishes the re-authoring under src/scenarios/<new-id>/. See .github/chatmodes/switch-to-variant.chatmode.md for guidance.

Variant	Status today	When to reach for it
Supervisor routing	Shipped (default). Flagship under `src/scenarios/sales_research/`. Lint, evals, and bootstrap all assume this shape.	Research / multi-facet briefing. Default.
Single-agent retrieval	Stub + walkthrough at `patterns/single-agent/` (one `src/agent.py` + chatmode). Pages publishes the walkthrough only on GitHub, not in the docs site.	Doc Q&A, policy lookup — one agent + retrieval, no side-effects.
Chat-with-actioning	Stub + walkthrough at `patterns/chat-with-actioning/` (one `src/chat.py` + chatmode). HITL gating from the flagship still applies to every side-effect.	Conversational UX over multi-turn tool use.

Raising past 3–5 coordinated workers is out of scope for v1 — evaluation surface and HITL coordination get unmanageable. Split into separate engagements.

Anti-patterns the lint or reviewers reject¶

Anti-pattern	Why
Agent instructions in Python source	Authoring source is `docs/agent-specs/*.md`; `src/bootstrap.py` syncs them to Foundry at startup. Code references agents by ID only.
Side-effect tool without HITL checkpoint	HITL is a hard invariant for every `side_effect_tools` entry in `accelerator.yaml`.
Inline SDK pins in `src/Dockerfile`	Source of truth is `pyproject.toml` + `ga-versions.yaml`; lint blocks.
Direct model SDK calls bypassing telemetry	Go through `agent_framework` clients so OTel + cost tracking stay live.
Grounding source without declared ACL posture	Declare it in `accelerator.yaml` or the scenario retrieval schema.

Out of scope for v1¶

Sovereign cloud deployments
Multi-tenant control plane
Cryptographic attestation of partner-customised repos (community support model)
Terraform first-class (BYO-IaC is fine; match the Bicep contracts)
.NET / Java runtime parity (roadmap item)
More than 5 coordinated agents

Flagship implementation detail¶

Deeper walk of how the flagship scenario actually runs end-to-end in the customer's subscription. This is how the supervisor-routing pattern materialises in src/scenarios/sales_research/ and the Bicep under infra/.

Agent graph (flagship, as shipped)¶

            ┌───────────────────────────────────┐
            │         Supervisor agent          │
            │  (intent + parallel routing)      │
            └───────────────────────────────────┘
              │        │          │          │
     ┌────────┘   ┌────┘     ┌────┘     ┌────┘
     ▼            ▼           ▼           ▼
┌──────────┐ ┌───────────┐ ┌────────────┐ ┌─────────────────┐
│ Account  │ │ ICP / Fit │ │Competitive │ │     Outreach    │
│Researcher│ │  Analyst  │ │  Context   │ │   Personalizer  │
└──────────┘ └───────────┘ └────────────┘ └─────────────────┘
     │            │           │                   │
     └────────────┴───────────┴───────────────────┘
                          ▼
                  ┌───────────────┐
                  │  Aggregator   │  assembles sales brief + outreach draft
                  └───────────────┘
                          ▼
             ┌──────────────────────────┐
             │ HITL gate: CRM write /   │
             │           email send     │
             └──────────────────────────┘
                          ▼
            (crm_write_contact, send_email)

Azure topology (deployed by `azd up`)¶

┌─────────────────────── Customer subscription ──────────────────────┐
│                                                                    │
│   Container Apps (API + worker)                                    │
│     ├── Managed Identity ─────────────────────────┐                │
│     ├── uses DefaultAzureCredential               │                │
│     └── emits OpenTelemetry → App Insights        │                │
│                                                   │                │
│   Microsoft Foundry project ◀──────────────────────┤                │
│     agents retrieved by name; instructions in portal               │
│     content filters applied via IaC                                │
│                                                   │                │
│   Azure AI Search ◀───────────────────────────────┤                │
│     index: accounts, past-touches, competitive                     │
│                                                   │                │
│   Azure Key Vault ◀───────────────────────────────┤                │
│     secrets for CRM, SMTP, misc external systems                   │
│                                                   │                │
│   Application Insights + Log Analytics                             │
│     dashboards: roi-kpis, hitl-activity, cost-attribution          │
│                                                                    │
│   (optional) Private Endpoints — required for regulated workloads  │
└────────────────────────────────────────────────────────────────────┘

Data + control flow¶

Partner or customer invokes the API (src/main.py).
src/scenarios/sales_research/workflow.py executes the supervisor with the request.
Supervisor classifies intent and dispatches to workers (parallel).
Workers ground via retrieval/ai_search.py and tools; return structured outputs.
Aggregator composes the final brief + outreach draft.
Any side-effect tool (CRM write, email send) passes accelerator_baseline/hitl.checkpoint first.
Telemetry events emitted to App Insights; KPI events drive ROI dashboards.

Why this shape¶

Parallel specialists reduce latency vs serial single-agent prompting.
Aggregator as executor keeps composition logic out of worker prompts (easier to eval).
HITL at the edge (not inside workers) means we can audit and change policy without retraining prompts.
Foundry-managed instructions let non-engineers refine agent behavior via portal.