API Reference¶
Interactive API documentation generated from the OpenAPI schema. This provides the definitive reference for all REST endpoints, WebSocket connections, authentication, and configuration.
Interactive Documentation¶
Real-Time Voice Agent API 1.0.0¶
Contact: Real-Time Voice Agent Team support@example.com
License: MIT License
health¶
GET /api/v1/health¶
Basic Health Check
Description Basic health check endpoint that returns 200 if the server is running. Used by load balancers for liveness checks.
Response 200 OK¶
application/json
{
"status": "healthy",
"version": "1.0.0",
"timestamp": 1691668800.0,
"message": "Real-Time Audio Agent API v1 is running",
"details": {
"api_version": "v1",
"service": "rtagent-backend"
}
}
Schema of the response body
{
"properties": {
"status": {
"type": "string",
"title": "Status",
"description": "Overall health status",
"example": "healthy"
},
"version": {
"type": "string",
"title": "Version",
"description": "API version",
"default": "1.0.0",
"example": "1.0.0"
},
"timestamp": {
"type": "number",
"title": "Timestamp",
"description": "Timestamp when check was performed",
"example": 1691668800.0
},
"message": {
"type": "string",
"title": "Message",
"description": "Human-readable status message",
"example": "Real-Time Audio Agent API v1 is running"
},
"details": {
"additionalProperties": true,
"type": "object",
"title": "Details",
"description": "Additional health details",
"example": {
"api_version": "v1",
"service": "rtagent-backend"
}
},
"active_sessions": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Active Sessions",
"description": "Current number of active realtime conversation sessions (None if unavailable)",
"example": 3
},
"session_metrics": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"title": "Session Metrics",
"description": "Optional granular session metrics (connected/disconnected, etc.)",
"example": {
"active": 3,
"connected": 5,
"disconnected": 2
}
}
},
"type": "object",
"required": [
"status",
"timestamp",
"message"
],
"title": "HealthResponse",
"description": "Health check response model.",
"example": {
"active_sessions": 3,
"details": {
"api_version": "v1",
"service": "rtagent-backend"
},
"message": "Real-Time Audio Agent API v1 is running",
"session_metrics": {
"active": 3,
"connected": 5,
"disconnected": 2
},
"status": "healthy",
"timestamp": 1691668800.0,
"version": "1.0.0"
}
}
GET /api/v1/readiness¶
Comprehensive Readiness Check
Description Comprehensive readiness probe that checks all critical dependencies with timeouts.
This endpoint verifies:
- Redis connectivity and performance
- Azure OpenAI client health
- Speech services (TTS/STT) availability
- ACS caller configuration and connectivity
- RT Agents initialization
- Authentication configuration (when ENABLE_AUTH_VALIDATION=True)
- Event system health
When authentication validation is enabled, checks:
- BACKEND_AUTH_CLIENT_ID is set and is a valid GUID
- AZURE_TENANT_ID is set and is a valid GUID
- ALLOWED_CLIENT_IDS contains at least one valid GUID
Returns 503 if any critical services are unhealthy, 200 if all systems are
ready.
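The authentication checks listed above can be approximated with the standard library. This is a minimal sketch, not the service's actual implementation: the function names `is_valid_guid` and `check_auth_config` are hypothetical, and treating ALLOWED_CLIENT_IDS as a comma-separated list is an assumption.

```python
import uuid
from typing import Mapping


def is_valid_guid(value: str) -> bool:
    """Return True if value parses as a GUID.

    Note: uuid.UUID is lenient and also accepts braces and un-hyphenated hex.
    """
    try:
        uuid.UUID(value)
        return True
    except (ValueError, AttributeError, TypeError):
        return False


def check_auth_config(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    for name in ("BACKEND_AUTH_CLIENT_ID", "AZURE_TENANT_ID"):
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif not is_valid_guid(value):
            problems.append(f"{name} is not a valid GUID")

    # Assumption: ALLOWED_CLIENT_IDS is a comma-separated list of client IDs.
    raw = env.get("ALLOWED_CLIENT_IDS", "")
    allowed = [item.strip() for item in raw.split(",") if item.strip()]
    if not any(is_valid_guid(item) for item in allowed):
        problems.append("ALLOWED_CLIENT_IDS contains no valid GUID")
    return problems
```

Passing `os.environ` as `env` would run the same checks against the current process environment.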
Response 200 OK¶
application/json
{
"status": "ready",
"timestamp": 1691668800.0,
"response_time_ms": 45.2,
"checks": [
{
"component": "redis",
"status": "healthy",
"check_time_ms": 12.5,
"details": "Connected to Redis successfully"
},
{
"component": "auth_configuration",
"status": "healthy",
"check_time_ms": 1.2,
"details": "Auth validation enabled with 2 allowed client(s)"
}
],
"event_system": {
"is_healthy": true,
"handlers_count": 7,
"domains_count": 2
}
}
Schema of the response body
{
"properties": {
"status": {
"type": "string",
"enum": [
"ready",
"not_ready",
"degraded"
],
"title": "Status",
"description": "Overall readiness status",
"example": "ready"
},
"timestamp": {
"type": "number",
"title": "Timestamp",
"description": "Timestamp when check was performed",
"example": 1691668800.0
},
"response_time_ms": {
"type": "number",
"title": "Response Time Ms",
"description": "Total time taken for all checks in milliseconds",
"example": 45.2
},
"checks": {
"items": {
"$ref": "#/components/schemas/ServiceCheck"
},
"type": "array",
"title": "Checks",
"description": "Individual component health checks"
},
"event_system": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"title": "Event System",
"description": "Event system status information",
"example": {
"domains_count": 2,
"handlers_count": 7,
"is_healthy": true
}
}
},
"type": "object",
"required": [
"status",
"timestamp",
"response_time_ms",
"checks"
],
"title": "ReadinessResponse",
"description": "Comprehensive readiness check response model.",
"example": {
"checks": [
{
"check_time_ms": 12.5,
"component": "redis",
"details": "Connected to Redis successfully",
"status": "healthy"
},
{
"check_time_ms": 8.3,
"component": "azure_openai",
"details": "Client initialized",
"status": "healthy"
}
],
"event_system": {
"domains_count": 2,
"handlers_count": 7,
"is_healthy": true
},
"response_time_ms": 45.2,
"status": "ready",
"timestamp": 1691668800.0
}
}
Response 503 Service Unavailable¶
application/json
{
"status": "not_ready",
"timestamp": 1691668800.0,
"response_time_ms": 1250.0,
"checks": [
{
"component": "redis",
"status": "unhealthy",
"check_time_ms": 1000.0,
"error": "Connection timeout"
},
{
"component": "auth_configuration",
"status": "unhealthy",
"check_time_ms": 2.1,
"error": "BACKEND_AUTH_CLIENT_ID is not a valid GUID"
}
]
}
Schema of the response body
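A deployment script or client can reduce the readiness payload to a simple ready flag plus the failing components. The sketch below relies only on the fields shown in the examples above; the helper name `summarize_readiness` is hypothetical.

```python
import json


def summarize_readiness(status_code: int, body: str) -> dict:
    """Collapse a readiness response into a ready flag and failed components."""
    payload = json.loads(body)
    failed = {
        # Unhealthy checks carry "error"; fall back to "details" if absent.
        check["component"]: check.get("error") or check.get("details") or "unknown"
        for check in payload.get("checks", [])
        if check.get("status") != "healthy"
    }
    return {
        "ready": status_code == 200 and payload.get("status") == "ready",
        "failed": failed,
    }
```

For the 503 example above, `failed` would map `redis` and `auth_configuration` to their error strings.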
GET /api/v1/agents¶
Get Agents Info
Description Get information about loaded RT agents including their configuration, model settings, and voice settings that can be modified.
Response 200 OK¶
application/json
Schema of the response body
PUT /api/v1/agents/{agent_name}¶
Update Agent Config
Description Update configuration for a specific agent (model settings, voice, etc.). Changes are applied to the runtime instance but not persisted to YAML files.
Input parameters
| Parameter | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| agent_name | path | string | | No | |
Request body
application/json
This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.
Schema of the request body
{
"properties": {
"model": {
"anyOf": [
{
"$ref": "#/components/schemas/AgentModelUpdate"
},
{
"type": "null"
}
]
},
"voice": {
"anyOf": [
{
"$ref": "#/components/schemas/AgentVoiceUpdate"
},
{
"type": "null"
}
]
}
},
"type": "object",
"title": "AgentConfigUpdate"
}
Response 200 OK¶
application/json
Schema of the response body
Response 422 Unprocessable Entity¶
application/json
This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.
Schema of the response body
{
"properties": {
"detail": {
"items": {
"$ref": "#/components/schemas/ValidationError"
},
"type": "array",
"title": "Detail"
}
},
"type": "object",
"title": "HTTPValidationError"
}
Call Management¶
POST /api/v1/calls/initiate¶
Initiate Outbound Call
Description Initiate a new outbound call to the specified phone number.
This endpoint:
- Validates the phone number format
- Generates a unique call ID
- Emits a call initiation event through the V1 event system
- Returns immediately with call status
The actual call establishment is handled asynchronously through Azure
Communication Services.
Request body
application/json
{
"caller_id": "+1987654321",
"context": {
"customer_id": "cust_12345",
"department": "support"
},
"target_number": "+1234567890"
}
Schema of the request body
{
"properties": {
"target_number": {
"type": "string",
"pattern": "^\\+[1-9]\\d{1,14}$",
"title": "Target Number",
"description": "Phone number to call in E.164 format (e.g., +1234567890)",
"example": "+1234567890"
},
"caller_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Caller Id",
"description": "Caller ID to display (optional, uses system default if not provided)",
"example": "+1987654321"
},
"context": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"title": "Context",
"description": "Additional call context metadata",
"example": {
"customer_id": "cust_12345",
"department": "support",
"priority": "high",
"source": "web_portal"
}
}
},
"type": "object",
"required": [
"target_number"
],
"title": "CallInitiateRequest",
"description": "Request model for initiating a call.",
"example": {
"caller_id": "+1987654321",
"context": {
"customer_id": "cust_12345",
"department": "support"
},
"target_number": "+1234567890"
}
}
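Clients can validate the number locally before sending the request. The following sketch reuses the `pattern` from the schema above; the helper name `build_initiate_payload` is hypothetical and the server remains the authority on what it accepts.

```python
import json
import re
from typing import Optional

# Pattern copied from the CallInitiateRequest schema (E.164 format).
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def build_initiate_payload(
    target_number: str,
    caller_id: Optional[str] = None,
    context: Optional[dict] = None,
) -> str:
    """Return the JSON body for POST /api/v1/calls/initiate."""
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    if not E164_PATTERN.fullmatch(target_number):
        raise ValueError(f"target_number is not in E.164 format: {target_number!r}")
    payload = {"target_number": target_number}
    # Optional fields are omitted rather than sent as null.
    if caller_id is not None:
        payload["caller_id"] = caller_id
    if context is not None:
        payload["context"] = context
    return json.dumps(payload)
```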
Response 200 OK¶
application/json
{
"call_id": "call_abc12345",
"status": "initiating",
"target_number": "+1234567890",
"message": "Call initiation requested for +1234567890"
}
Schema of the response body
{
"properties": {
"call_id": {
"type": "string",
"title": "Call Id",
"description": "Unique call identifier",
"example": "call_abc12345"
},
"status": {
"type": "string",
"title": "Status",
"description": "Current call status",
"example": "initiating"
},
"target_number": {
"type": "string",
"title": "Target Number",
"description": "Target phone number",
"example": "+1234567890"
},
"message": {
"type": "string",
"title": "Message",
"description": "Human-readable status message",
"example": "Call initiation requested"
}
},
"type": "object",
"required": [
"call_id",
"status",
"target_number",
"message"
],
"title": "CallInitiateResponse",
"description": "Response model for call initiation.",
"example": {
"call_id": "call_abc12345",
"message": "Call initiation requested for +1234567890",
"status": "initiating",
"target_number": "+1234567890"
}
}
Response 400 Bad Request¶
application/json
Schema of the response body
Response 500 Internal Server Error¶
application/json
Schema of the response body
Response 422 Unprocessable Entity¶
application/json
This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.
Schema of the response body
{
"properties": {
"detail": {
"items": {
"$ref": "#/components/schemas/ValidationError"
},
"type": "array",
"title": "Detail"
}
},
"type": "object",
"title": "HTTPValidationError"
}
GET /api/v1/calls/¶
List Calls
Description Retrieve a paginated list of calls with optional filtering.
Supports:
- Pagination with page and limit parameters
- Filtering by call status
- Sorting by creation time (newest first)
Input parameters
| Parameter | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| limit | query | integer | 10 | No | Number of items per page (1-100) |
| page | query | integer | 1 | No | Page number (1-based) |
| status_filter | query | | None | No | Filter calls by status |
Response 200 OK¶
application/json
{
"calls": [
{
"call_id": "call_abc12345",
"status": "connected",
"duration": 120,
"participants": [],
"events": []
}
],
"total": 25,
"page": 1,
"limit": 10
}
Schema of the response body
{
"properties": {
"calls": {
"items": {
"$ref": "#/components/schemas/CallStatusResponse"
},
"type": "array",
"title": "Calls",
"description": "List of calls"
},
"total": {
"type": "integer",
"title": "Total",
"description": "Total number of calls matching criteria",
"example": 25
},
"page": {
"type": "integer",
"title": "Page",
"description": "Current page number (1-based)",
"default": 1,
"example": 1
},
"limit": {
"type": "integer",
"title": "Limit",
"description": "Number of items per page",
"default": 10,
"example": 10
}
},
"type": "object",
"required": [
"calls",
"total"
],
"title": "CallListResponse",
"description": "Response model for listing calls.",
"example": {
"calls": [
{
"call_id": "call_abc12345",
"duration": 120,
"events": [],
"participants": [],
"status": "connected"
}
],
"limit": 10,
"page": 1,
"total": 25
}
}
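The `total` and `limit` fields are enough to work out how many pages a client needs to fetch. Below is a minimal sketch of building the list URL and computing the page count; the base URL and helper names are assumptions for illustration.

```python
import math
from typing import Optional
from urllib.parse import urlencode


def calls_url(base_url: str, page: int = 1, limit: int = 10,
              status_filter: Optional[str] = None) -> str:
    """Build the GET /api/v1/calls/ URL with pagination parameters."""
    if page < 1:
        raise ValueError("page is 1-based")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"page": page, "limit": limit}
    if status_filter is not None:
        params["status_filter"] = status_filter
    return f"{base_url.rstrip('/')}/api/v1/calls/?{urlencode(params)}"


def page_count(total: int, limit: int) -> int:
    """Number of pages needed to list `total` calls at `limit` per page."""
    return math.ceil(total / limit) if total > 0 else 0
```

With the example response above (`total` 25, `limit` 10), a client would request pages 1 through 3.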
Response 400 Bad Request¶
application/json
Schema of the response body
Response 422 Unprocessable Entity¶
application/json
This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.
Schema of the response body
{
"properties": {
"detail": {
"items": {
"$ref": "#/components/schemas/ValidationError"
},
"type": "array",
"title": "Detail"
}
},
"type": "object",
"title": "HTTPValidationError"
}
POST /api/v1/calls/answer¶
Answer Inbound Call
Description Handle inbound call events and Event Grid subscription validation.
This endpoint:
- Validates Event Grid subscription requests
- Answers incoming calls automatically with orchestrator selection
- Initializes conversation state with features
- Supports pluggable conversation orchestrators
- Provides advanced tracing and monitoring
Enhanced V1 features:
- Pluggable orchestrator injection for conversation handling
- Enhanced state management with orchestrator metadata
- Advanced observability and correlation
- Production-ready error handling
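Event Grid webhooks start with a validation handshake: the subscription sends an event of type `Microsoft.EventGrid.SubscriptionValidationEvent`, and the endpoint must echo `data.validationCode` back as `validationResponse`. The sketch below shows that handshake and the separation of incoming-call events. It is an illustration of the Event Grid protocol, not this service's handler; the function names are hypothetical.

```python
from typing import Optional

VALIDATION_EVENT = "Microsoft.EventGrid.SubscriptionValidationEvent"
INCOMING_CALL_EVENT = "Microsoft.Communication.IncomingCall"


def validation_response(events: list) -> Optional[dict]:
    """Return the handshake reply if the batch contains a validation event."""
    for event in events:
        if event.get("eventType") == VALIDATION_EVENT:
            return {"validationResponse": event["data"]["validationCode"]}
    return None


def incoming_calls(events: list) -> list:
    """Return only the events that announce an inbound ACS call."""
    return [e for e in events if e.get("eventType") == INCOMING_CALL_EVENT]
```

A handler would typically check `validation_response` first and return it immediately, then pass whatever `incoming_calls` yields to the call-answering logic.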
Response 200 OK¶
application/json
{
"status": "call answered",
"orchestrator": "gpt_flow",
"acs_features": {
"orchestrator_support": true,
"advanced_tracing": true,
"api_version": "v1"
}
}
Schema of the response body
Response 400 Bad Request¶
application/json
Schema of the response body
Response 503 Service Unavailable¶
application/json
Schema of the response body
POST /api/v1/calls/callbacks¶
Handle ACS Callback Events
Description Handle Azure Communication Services callback events.
This endpoint receives webhooks from ACS when call events occur:
- Call connected/disconnected
- Participant joined/left
- Media events (DTMF tones, play completed, etc.)
- Transfer events
The endpoint validates authentication, processes events through the
V1 CallEventProcessor system, and returns processing results.
Response 200 OK¶
application/json
Schema of the response body
Response 500 Internal Server Error¶
application/json
Schema of the response body
Response 503 Service Unavailable¶
application/json
Schema of the response body
ACS Media Session¶
GET /api/v1/media/status¶
Get Media Streaming Status
Description Get the current status of media streaming configuration.
Returns the current media streaming configuration and status as a JSON object.
Response 200 OK¶
application/json
Schema of the response body
{
"additionalProperties": true,
"type": "object",
"title": "Response Get Media Status Api V1 Media Status Get"
}
POST /api/v1/media/sessions¶
Create Media Session
Description Create a new media streaming session for Azure Communication Services.
Initializes a media session with specified audio configuration and returns WebSocket connection details for real-time audio streaming. This endpoint prepares the infrastructure for bidirectional media communication with configurable audio parameters.
Args: request: Media session configuration including call connection ID, audio format, sample rate, and streaming options.
Returns: MediaSessionResponse: Session details containing unique session ID, WebSocket URL for streaming, status, and audio configuration.
Raises: HTTPException: When session creation fails due to invalid configuration or system resource constraints.
Example:
>>> request = MediaSessionRequest(call_connection_id="call_123")
>>> response = await create_media_session(request)
>>> print(response.websocket_url)
Request body
application/json
{
"audio_format": "pcm_16",
"call_connection_id": "call_12345",
"channels": 1,
"chunk_size": 1024,
"enable_transcription": true,
"enable_vad": true,
"sample_rate": 16000
}
Schema of the request body
{
"properties": {
"call_connection_id": {
"type": "string",
"title": "Call Connection Id",
"description": "ACS call connection identifier",
"example": "call_12345"
},
"sample_rate": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Sample Rate",
"description": "Audio sample rate in Hz",
"default": 16000,
"example": 16000
},
"channels": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Channels",
"description": "Number of audio channels",
"default": 1,
"example": 1
},
"audio_format": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Audio Format",
"description": "Audio format (pcm_16, pcm_24, opus, etc.)",
"default": "pcm_16",
"example": "pcm_16"
},
"chunk_size": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"title": "Chunk Size",
"description": "Audio chunk size in bytes",
"default": 1024,
"example": 1024
},
"enable_transcription": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"title": "Enable Transcription",
"description": "Enable real-time transcription",
"default": true,
"example": true
},
"enable_vad": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"title": "Enable Vad",
"description": "Enable voice activity detection",
"default": true,
"example": true
}
},
"type": "object",
"required": [
"call_connection_id"
],
"title": "MediaSessionRequest",
"description": "Request schema for starting a media session.",
"example": {
"audio_format": "pcm_16",
"call_connection_id": "call_12345",
"channels": 1,
"chunk_size": 1024,
"enable_transcription": true,
"enable_vad": true,
"sample_rate": 16000
}
}
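The defaults above imply a fixed amount of audio per chunk: 1024 bytes at 16000 Hz, mono, 2 bytes per sample is 1024 / (16000 × 1 × 2) = 0.032 s, or 32 ms. The sketch below builds a request body from the schema defaults and reproduces that arithmetic. It assumes `pcm_16` means 16-bit samples, and the helper names are hypothetical.

```python
# Defaults copied from the MediaSessionRequest schema above.
DEFAULTS = {
    "sample_rate": 16000,
    "channels": 1,
    "audio_format": "pcm_16",
    "chunk_size": 1024,
    "enable_transcription": True,
    "enable_vad": True,
}


def media_session_request(call_connection_id: str, **overrides) -> dict:
    """Build a request body, applying schema defaults for omitted fields."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {"call_connection_id": call_connection_id, **DEFAULTS, **overrides}


def chunk_duration_ms(chunk_size: int, sample_rate: int, channels: int,
                      bytes_per_sample: int = 2) -> float:
    """Milliseconds of audio in one chunk (assumes 16-bit PCM by default)."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return 1000.0 * chunk_size / bytes_per_second
```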
Response 200 OK¶
application/json
{
"configuration": {
"channels": 1,
"chunk_size": 1024,
"format": "pcm_16",
"sample_rate": 16000
},
"created_at": "2025-08-10T13:45:00Z",
"session_id": "media_session_123456",
"status": "active",
"websocket_url": "wss://api.example.com/v1/media/stream/media_session_123456"
}
Schema of the response body
{
"properties": {
"session_id": {
"type": "string",
"title": "Session Id",
"description": "Unique media session identifier",
"example": "media_session_123456"
},
"websocket_url": {
"type": "string",
"title": "Websocket Url",
"description": "WebSocket URL for audio streaming",
"example": "wss://api.example.com/v1/media/stream/media_session_123456"
},
"status": {
"type": "string",
"title": "Status",
"description": "Session status",
"example": "active"
},
"created_at": {
"type": "string",
"title": "Created At",
"description": "Session creation timestamp",
"example": "2025-08-10T13:45:00Z"
},
"configuration": {
"additionalProperties": true,
"type": "object",
"title": "Configuration",
"description": "Session configuration settings",
"example": {
"channels": 1,
"chunk_size": 1024,
"format": "pcm_16",
"sample_rate": 16000
}
}
},
"type": "object",
"required": [
"session_id",
"websocket_url",
"status",
"created_at",
"configuration"
],
"title": "MediaSessionResponse",
"description": "Response schema for media session creation.",
"example": {
"configuration": {
"channels": 1,
"chunk_size": 1024,
"format": "pcm_16",
"sample_rate": 16000
},
"created_at": "2025-08-10T13:45:00Z",
"session_id": "media_session_123456",
"status": "active",
"websocket_url": "wss://api.example.com/v1/media/stream/media_session_123456"
}
}
Response 422 Unprocessable Entity¶
application/json
This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.
Schema of the response body
{
"properties": {
"detail": {
"items": {
"$ref": "#/components/schemas/ValidationError"
},
"type": "array",
"title": "Detail"
}
},
"type": "object",
"title": "HTTPValidationError"
}
GET /api/v1/media/sessions/{session_id}¶
Get Media Session Status
Description Retrieve status and metadata for a specific media session.
Queries the current state of an active media session including connection status, WebSocket state, and session configuration details. Used for monitoring and debugging media streaming sessions.
Args: session_id: Unique identifier for the media session to query.
Returns: dict: Session information including status, connection state, creation timestamp, and API version details.
Example:
>>> session_info = await get_media_session("media_session_123")
>>> print(session_info["status"])
Input parameters
| Parameter | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| session_id | path | string | | No | |
Response 200 OK¶
application/json
Schema of the response body
{
"type": "object",
"additionalProperties": true,
"title": "Response Get Media Session Api V1 Media Sessions Session Id Get"
}
Response 422 Unprocessable Entity¶
application/json
This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.
Schema of the response body
{
"properties": {
"detail": {
"items": {
"$ref": "#/components/schemas/ValidationError"
},
"type": "array",
"title": "Detail"
}
},
"type": "object",
"title": "HTTPValidationError"
}
Real-time Communication¶
GET /api/v1/realtime/status¶
Get Realtime Service Status
Description Get the current status of the realtime communication service.
Returns information about:
- Service availability and health
- Supported protocols and features
- Active connection counts
- WebSocket endpoint configurations
Response 200 OK¶
application/json
{
"status": "available",
"websocket_endpoints": {
"dashboard_relay": "/api/v1/realtime/dashboard/relay",
"conversation": "/api/v1/realtime/conversation"
},
"features": {
"dashboard_broadcasting": true,
"conversation_streaming": true,
"orchestrator_support": true,
"session_management": true
},
"active_connections": {
"dashboard_clients": 0,
"conversation_sessions": 0
},
"version": "v1"
}
Schema of the response body
{
"properties": {
"status": {
"type": "string",
"enum": [
"available",
"degraded",
"unavailable"
],
"title": "Status",
"description": "Current service status",
"example": "available"
},
"websocket_endpoints": {
"additionalProperties": {
"type": "string"
},
"type": "object",
"title": "Websocket Endpoints",
"description": "Available WebSocket endpoints",
"example": {
"conversation": "/api/v1/realtime/conversation",
"dashboard_relay": "/api/v1/realtime/dashboard/relay"
}
},
"features": {
"additionalProperties": {
"type": "boolean"
},
"type": "object",
"title": "Features",
"description": "Supported features and capabilities",
"example": {
"conversation_streaming": true,
"dashboard_broadcasting": true,
"orchestrator_support": true,
"session_management": true
}
},
"active_connections": {
"additionalProperties": {
"type": "integer"
},
"type": "object",
"title": "Active Connections",
"description": "Current active connection counts",
"example": {
"conversation_sessions": 0,
"dashboard_clients": 0
}
},
"protocols_supported": {
"items": {
"type": "string"
},
"type": "array",
"title": "Protocols Supported",
"description": "Supported communication protocols",
"default": [
"WebSocket"
],
"example": [
"WebSocket"
]
},
"version": {
"type": "string",
"title": "Version",
"description": "API version",
"default": "v1",
"example": "v1"
}
},
"type": "object",
"required": [
"status",
"websocket_endpoints",
"features",
"active_connections"
],
"title": "RealtimeStatusResponse",
"description": "Response schema for realtime service status endpoint.\n\nProvides comprehensive information about the realtime communication\nservice including availability, features, and active connections."
}
Schemas¶
AgentConfigUpdate¶
| Name | Type |
|---|---|
| model | AgentModelUpdate or null |
| voice | AgentVoiceUpdate or null |
AgentModelUpdate¶
| Name | Type |
|---|---|
| deployment_id | |
| max_tokens | |
| temperature | |
| top_p | |
AgentVoiceUpdate¶
| Name | Type |
|---|---|
| voice_name | |
| voice_style | |
CallInitiateRequest¶
| Name | Type |
|---|---|
| caller_id | string or null |
| context | object or null |
| target_number | string |
CallInitiateResponse¶
| Name | Type |
|---|---|
| call_id | string |
| message | string |
| status | string |
| target_number | string |
CallListResponse¶
| Name | Type |
|---|---|
| calls | Array<CallStatusResponse> |
| limit | integer |
| page | integer |
| total | integer |
CallStatusResponse¶
| Name | Type |
|---|---|
| call_id | string |
| duration | |
| events | Array<> |
| participants | Array<> |
| status | string |
HealthResponse¶
| Name | Type |
|---|---|
| active_sessions | integer or null |
| details | object. Example: {'api_version': 'v1', 'service': 'rtagent-backend'} |
| message | string |
| session_metrics | object or null |
| status | string |
| timestamp | number |
| version | string |
HTTPValidationError¶
| Name | Type |
|---|---|
| detail | Array<ValidationError> |
MediaSessionRequest¶
| Name | Type |
|---|---|
| audio_format | string or null |
| call_connection_id | string |
| channels | integer or null |
| chunk_size | integer or null |
| enable_transcription | boolean or null |
| enable_vad | boolean or null |
| sample_rate | integer or null |
MediaSessionResponse¶
| Name | Type |
|---|---|
| configuration | Example: {'channels': 1, 'chunk_size': 1024, 'format': 'pcm_16', 'sample_rate': 16000} |
| created_at | string |
| session_id | string |
| status | string |
| websocket_url | string |
ReadinessResponse¶
| Name | Type |
|---|---|
| checks | Array<ServiceCheck> |
| event_system | object or null |
| response_time_ms | number |
| status | string |
| timestamp | number |
RealtimeStatusResponse¶
| Name | Type |
|---|---|
| active_connections | Example: {'conversation_sessions': 0, 'dashboard_clients': 0} |
| features | Example: {'conversation_streaming': True, 'dashboard_broadcasting': True, 'orchestrator_support': True, 'session_management': True} |
| protocols_supported | Array<string> |
| status | string |
| version | string |
| websocket_endpoints | Example: {'conversation': '/api/v1/realtime/conversation', 'dashboard_relay': '/api/v1/realtime/dashboard/relay'} |
ServiceCheck¶
| Name | Type |
|---|---|
| check_time_ms | number |
| component | string |
| details | |
| error | |
| status | string |
ValidationError¶
| Name | Type |
|---|---|
| loc | Array<> |
| msg | string |
| type | string |
WebSocket Endpoints¶
The following WebSocket endpoints provide real-time communication capabilities:
Media Streaming WebSocket¶
URL: wss://api.domain.com/api/v1/media/stream
Real-time bidirectional audio streaming for Azure Communication Services calls following ACS WebSocket protocol.
Query Parameters:
- call_connection_id (required): ACS call connection identifier
- session_id (optional): Browser session ID for UI coordination
Audio Formats:
- MEDIA/TRANSCRIPTION Mode: PCM 16kHz mono (16-bit)
- VOICE_LIVE Mode: PCM 24kHz mono (16-bit) for Azure OpenAI Realtime API
Message Types:
// Incoming audio data
{
"kind": "AudioData",
"audioData": {
"timestamp": "2025-09-28T12:00:00Z",
"participantRawID": "8:acs:...",
"data": "base64EncodedPCMAudio",
"silent": false
}
}
// Outgoing audio data (bidirectional streaming)
{
"Kind": "AudioData",
"AudioData": {
"Data": "base64EncodedPCMAudio"
}
}
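Both message shapes carry base64-encoded PCM, so a handler mostly needs to decode incoming frames and wrap outgoing ones. The sketch below follows the field names and casing shown above. Skipping frames marked `silent` is an assumption about desired behavior, and the function names are hypothetical.

```python
import base64
import json
from typing import Optional


def decode_incoming_audio(message: str) -> Optional[bytes]:
    """Return raw PCM bytes from an incoming AudioData frame, else None."""
    msg = json.loads(message)
    if msg.get("kind") != "AudioData":
        return None  # Not an audio frame.
    audio = msg["audioData"]
    if audio.get("silent"):
        return None  # Assumption: silent frames need no processing.
    return base64.b64decode(audio["data"])


def encode_outgoing_audio(pcm: bytes) -> str:
    """Wrap raw PCM bytes in the outgoing AudioData envelope."""
    return json.dumps({
        "Kind": "AudioData",
        "AudioData": {"Data": base64.b64encode(pcm).decode("ascii")},
    })
```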
Realtime Conversation WebSocket¶
URL: wss://api.domain.com/api/v1/realtime/conversation
Browser-based voice conversations with session persistence and real-time transcription.
Query Parameters:
- session_id (optional): Conversation session identifier for session restoration
Features:
- Real-time speech-to-text transcription
- TTS audio streaming for responses
- Conversation context persistence
- Multi-language support
Dashboard Relay WebSocket¶
URL: wss://api.domain.com/api/v1/realtime/dashboard/relay
Real-time updates for dashboard clients monitoring ongoing conversations.
Query Parameters:
- session_id (optional): Filter updates for specific conversation sessions
Use Cases:
- Live call monitoring and analytics
- Real-time transcript viewing
- Agent performance dashboards
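The three WebSocket URLs differ only in path and query parameters, so a small helper can build them consistently. This is a sketch using the paths and parameters documented above; the helper name and the short endpoint keys are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Paths taken from the endpoint descriptions above; the keys are illustrative.
WS_PATHS = {
    "media": "/api/v1/media/stream",
    "conversation": "/api/v1/realtime/conversation",
    "dashboard": "/api/v1/realtime/dashboard/relay",
}


def websocket_url(host: str, endpoint: str, *,
                  call_connection_id: Optional[str] = None,
                  session_id: Optional[str] = None) -> str:
    """Build a wss:// URL for one of the documented WebSocket endpoints."""
    params = {}
    if endpoint == "media":
        # call_connection_id is required only for media streaming.
        if not call_connection_id:
            raise ValueError("call_connection_id is required for the media endpoint")
        params["call_connection_id"] = call_connection_id
    if session_id:
        params["session_id"] = session_id
    query = f"?{urlencode(params)}" if params else ""
    return f"wss://{host}{WS_PATHS[endpoint]}{query}"
```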
Authentication & Security¶
All endpoints support Microsoft Entra ID authentication using DefaultAzureCredential, following Azure best practices.
Authentication Methods¶
Environment Variables (Recommended for production):
# Service Principal Authentication
export AZURE_CLIENT_ID="your-client-id"
export AZURE_CLIENT_SECRET="your-client-secret"
export AZURE_TENANT_ID="your-tenant-id"
Azure CLI (Development):
Managed Identity (Azure deployment):
- System-assigned or user-assigned managed identity
- No credential management required
- Automatic token refresh
Required RBAC Roles¶
Grant these Azure roles to your service principal or managed identity:
| Service | Required Role | Purpose |
|---|---|---|
| Azure Speech Services | Cognitive Services User | STT/TTS operations |
| Azure Cache for Redis | Redis Cache Contributor | Session state management |
| Azure Communication Services | Communication Services Contributor | Call automation and media streaming |
| Azure Storage | Storage Blob Data Contributor | Call recordings and artifacts |
| Azure OpenAI | Cognitive Services OpenAI User | AI model inference |
Security Features¶
- Credential-less authentication with managed identity
- Connection pooling with automatic token refresh
- TLS encryption for all HTTP/WebSocket connections
- Input validation and request sanitization
- Rate limiting per Azure service quotas
Configuration¶
Required Environment Variables¶
Azure Services Configuration:
# Azure Speech Services
AZURE_SPEECH_REGION=eastus
AZURE_SPEECH_RESOURCE_ID=/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{name}
# Azure Cache for Redis
AZURE_REDIS_HOSTNAME=your-redis.redis.cache.windows.net
AZURE_REDIS_USERNAME=default
# Azure Communication Services
ACS_ENDPOINT=https://your-acs.communication.azure.com
Application Configuration:
# Streaming Mode (affects audio processing pipeline)
ACS_STREAMING_MODE=MEDIA # MEDIA | VOICE_LIVE | TRANSCRIPTION
# Optional Settings
AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com # For AI features
AZURE_STORAGE_CONNECTION_STRING=... # For call recordings
Streaming Mode Configuration¶
Controls the audio processing pipeline and determines handler selection:
| Mode | Description | Audio Format | Use Case |
|---|---|---|---|
| MEDIA | Default STT/TTS pipeline | PCM 16kHz mono | Traditional phone calls with AI orchestration |
| VOICE_LIVE | Azure OpenAI Realtime API | PCM 24kHz mono | Advanced conversational AI |
| TRANSCRIPTION | Real-time transcription only | PCM 16kHz mono | Call recording and analysis |
📖 Reference: Complete streaming modes documentation
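One way to read and validate the mode at startup is sketched below. The sample rates come from the table above; falling back to MEDIA when the variable is unset is an assumption, and the names `StreamingMode` and `streaming_mode_from_env` are hypothetical.

```python
import os
from enum import Enum
from typing import Mapping


class StreamingMode(str, Enum):
    MEDIA = "MEDIA"
    VOICE_LIVE = "VOICE_LIVE"
    TRANSCRIPTION = "TRANSCRIPTION"


# Sample rates in Hz, taken from the streaming mode table.
SAMPLE_RATE_HZ = {
    StreamingMode.MEDIA: 16000,
    StreamingMode.VOICE_LIVE: 24000,
    StreamingMode.TRANSCRIPTION: 16000,
}


def streaming_mode_from_env(env: Mapping[str, str] = os.environ) -> StreamingMode:
    """Parse ACS_STREAMING_MODE, rejecting values outside the documented set."""
    # Assumption: MEDIA is used when the variable is not set.
    raw = env.get("ACS_STREAMING_MODE", "MEDIA").strip().upper()
    try:
        return StreamingMode(raw)
    except ValueError:
        valid = ", ".join(mode.value for mode in StreamingMode)
        raise ValueError(f"ACS_STREAMING_MODE must be one of: {valid}") from None
```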
Performance Tuning¶
Connection Pools (optional):
# Speech service connection limits
MAX_STT_POOL_SIZE=4
MAX_TTS_POOL_SIZE=4
# Redis connection pool
REDIS_MAX_CONNECTIONS=20
REDIS_CONNECTION_TIMEOUT=5
Audio Processing:
# Voice Activity Detection (VAD) settings
VAD_TIMEOUT_MS=2000 # Silence timeout
VAD_SENSITIVITY=medium # low | medium | high
# Barge-in detection
BARGE_IN_ENABLED=true
BARGE_IN_THRESHOLD_MS=10 # Response time for interruption
Error Handling¶
Standard Error Response Format¶
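Endpoints return error responses in a consistent JSON shape modeled on RFC 7807 (Problem Details for HTTP APIs), with some deviations such as status_code in place of the RFC's status member. Request validation failures (422) use the HTTPValidationError schema instead. A typical error body: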
{
"detail": "Human-readable error description",
"status_code": 400,
"timestamp": "2025-09-28T12:00:00Z",
"type": "validation_error",
"instance": "/api/v1/calls/initiate",
"errors": [
{
"field": "phone_number",
"message": "Invalid phone number format",
"code": "format_invalid"
}
]
}
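A client can fold both error shapes (the format above and the HTTPValidationError body returned with 422) into one exception type. The following is a sketch under that assumption; `ApiError` and `parse_error` are hypothetical names, not part of the API.

```python
import json


class ApiError(Exception):
    """Client-side error built from an API error body (hypothetical helper)."""

    def __init__(self, status_code, detail, error_type="", field_errors=None):
        super().__init__(f"{status_code} {error_type}: {detail}")
        self.status_code = status_code
        self.detail = detail
        self.error_type = error_type
        self.field_errors = field_errors or []


def parse_error(http_status: int, body: str) -> ApiError:
    """Convert an error response body into an ApiError."""
    data = json.loads(body)
    detail = data.get("detail", "")
    if isinstance(detail, list):
        # HTTPValidationError: detail is a list of {loc, msg, type} items.
        fields = [
            (".".join(str(part) for part in item.get("loc", [])), item.get("msg", ""))
            for item in detail
        ]
        return ApiError(http_status, "Request validation failed", "validation_error", fields)
    fields = [(e.get("field", ""), e.get("message", "")) for e in data.get("errors", [])]
    return ApiError(data.get("status_code", http_status), detail, data.get("type", ""), fields)
```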
HTTP Status Codes¶
| Status | Description | Common Causes |
|---|---|---|
| 200 | Success | Request completed successfully |
| 202 | Accepted | Async operation initiated |
| 400 | Bad Request | Invalid request format or parameters |
| 401 | Unauthorized | Missing or invalid authentication |
| 403 | Forbidden | Insufficient permissions or RBAC roles |
| 404 | Not Found | Resource not found |
| 422 | Validation Error | Request body schema validation failed |
| 429 | Rate Limited | Azure service quota exceeded |
| 500 | Internal Server Error | Unexpected server error |
| 502 | Bad Gateway | Azure service unavailable |
| 503 | Service Unavailable | Dependencies not ready |
| 504 | Gateway Timeout | Azure service timeout |
Service-Specific Errors¶
Azure Speech Services:
- speech_quota_exceeded - API rate limit reached
- speech_region_unavailable - Speech service region down
- audio_format_unsupported - Invalid audio format specified
Azure Communication Services:
- call_not_found - Call connection ID invalid
- media_streaming_failed - WebSocket streaming error
- pstn_number_invalid - Phone number format error
Azure Cache for Redis:
- redis_connection_failed - Redis cluster unavailable
- session_expired - Session data TTL exceeded
Retry Strategy¶
The API implements exponential backoff for transient errors:
# Retry configuration
RETRY_MAX_ATTEMPTS=3
RETRY_BACKOFF_FACTOR=2.0
RETRY_JITTER=true
# Service-specific timeouts
SPEECH_REQUEST_TIMEOUT=30
ACS_CALL_TIMEOUT=60
REDIS_OPERATION_TIMEOUT=5
📖 Reference: Azure Service reliability patterns
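To make the retry settings concrete, the sketch below computes the wait before each retry for a given configuration. It assumes RETRY_MAX_ATTEMPTS counts the initial attempt, a 1-second base delay, and jitter that scales each delay into 50% to 100% of its nominal value. None of these details are confirmed by the settings above, and the service's own policy may differ.

```python
import random
from typing import Callable


def backoff_delays(max_attempts: int = 3, backoff_factor: float = 2.0,
                   base_delay: float = 1.0, jitter: bool = True,
                   rng: Callable[[], float] = random.random) -> list:
    """Return the delay in seconds to wait before each retry.

    With max_attempts=3 there are two retries, hence two delays.
    """
    delays = []
    for retry in range(max_attempts - 1):
        delay = base_delay * (backoff_factor ** retry)  # 1s, 2s, 4s, ...
        if jitter:
            # Scale into [0.5, 1.0) of the nominal delay to spread out retries.
            delay *= 0.5 + rng() / 2
        delays.append(delay)
    return delays
```

Jitter keeps many clients that failed at the same moment from retrying in lockstep.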
Getting Started¶
Quick Setup¶
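- Configure authentication (see Authentication Methods).
- Set the required environment variables (see Required Environment Variables).
- Test the health endpoint: GET /api/v1/health
- Initiate a test call: POST /api/v1/calls/initiate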
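The last two steps can be sketched with the standard library alone. The base URL `http://localhost:8000` is a placeholder, and passing the token as a Bearer `Authorization` header is an assumption about how the deployment expects credentials; adjust both to your environment.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # Placeholder; replace with your deployment.


def health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /api/v1/health request."""
    return urllib.request.Request(f"{base_url}/api/v1/health", method="GET")


def initiate_call_request(target_number: str, base_url: str = BASE_URL,
                          token: Optional[str] = None) -> urllib.request.Request:
    """Build the POST /api/v1/calls/initiate request."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the API accepts a Bearer token when auth is enabled.
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"target_number": target_number}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/calls/initiate", data=body, headers=headers, method="POST"
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Against a running server, `send(health_request())` should return the health payload, and `send(initiate_call_request("+1234567890"))` should return a `call_id` with status `initiating`.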
Development Resources¶
- Interactive API Explorer - Test all endpoints directly in browser
- WebSocket Testing - WebSocket connection examples
- Authentication Setup - Detailed auth configuration
- Architecture Overview - System design and deployment patterns
Production Considerations¶
- Use managed identity authentication in Azure deployments
- Configure connection pooling for high-throughput scenarios
- Enable distributed tracing with Azure Monitor integration
- Implement health checks for all dependent services
- Set up monitoring and alerting for service reliability
📖 Reference: Production deployment guide