API Reference¶

Interactive API documentation generated from the OpenAPI schema. This provides the definitive reference for all REST endpoints, WebSocket connections, authentication, and configuration.

Interactive Documentation¶

Real-Time Voice Agent API 1.0.0¶

Real-Time Voice Agent API

Contact: Real-Time Voice Agent Team support@example.com

License: MIT License

health¶

GET /api/v1/health¶

Basic Health Check

Description Basic health check endpoint that returns 200 if the server is running. Used by load balancers for liveness checks.

Response 200 OK¶

application/json

{
    "status": "healthy",
    "version": "1.0.0",
    "timestamp": 1691668800.0,
    "message": "Real-Time Audio Agent API v1 is running",
    "details": {
        "api_version": "v1",
        "service": "rtagent-backend"
    }
}

Schema of the response body

{
    "properties": {
        "status": {
            "type": "string",
            "title": "Status",
            "description": "Overall health status",
            "example": "healthy"
        },
        "version": {
            "type": "string",
            "title": "Version",
            "description": "API version",
            "default": "1.0.0",
            "example": "1.0.0"
        },
        "timestamp": {
            "type": "number",
            "title": "Timestamp",
            "description": "Timestamp when check was performed",
            "example": 1691668800.0
        },
        "message": {
            "type": "string",
            "title": "Message",
            "description": "Human-readable status message",
            "example": "Real-Time Audio Agent API v1 is running"
        },
        "details": {
            "additionalProperties": true,
            "type": "object",
            "title": "Details",
            "description": "Additional health details",
            "example": {
                "api_version": "v1",
                "service": "rtagent-backend"
            }
        },
        "active_sessions": {
            "anyOf": [
                {
                    "type": "integer"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Active Sessions",
            "description": "Current number of active realtime conversation sessions (None if unavailable)",
            "example": 3
        },
        "session_metrics": {
            "anyOf": [
                {
                    "additionalProperties": true,
                    "type": "object"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Session Metrics",
            "description": "Optional granular session metrics (connected/disconnected, etc.)",
            "example": {
                "active": 3,
                "connected": 5,
                "disconnected": 2
            }
        }
    },
    "type": "object",
    "required": [
        "status",
        "timestamp",
        "message"
    ],
    "title": "HealthResponse",
    "description": "Health check response model.",
    "example": {
        "active_sessions": 3,
        "details": {
            "api_version": "v1",
            "service": "rtagent-backend"
        },
        "message": "Real-Time Audio Agent API v1 is running",
        "session_metrics": {
            "active": 3,
            "connected": 5,
            "disconnected": 2
        },
        "status": "healthy",
        "timestamp": 1691668800.0,
        "version": "1.0.0"
    }
}
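
As a minimal sketch of how a client might interpret this response, the helper below checks the three fields the schema marks as required and then inspects `status`. The function name `is_live` is hypothetical, and treating `"healthy"` as the only passing value is an assumption, since the schema does not define an enum for `status`.

```python
from typing import Any, Dict

# Fields listed under "required" in the HealthResponse schema.
REQUIRED_FIELDS = ("status", "timestamp", "message")


def is_live(payload: Dict[str, Any]) -> bool:
    """Return True when the payload is well-formed and reports 'healthy'."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"health payload is missing required fields: {missing}")
    # Assumption: "healthy" is the only passing value (the schema has no enum).
    return payload["status"] == "healthy"


sample = {
    "status": "healthy",
    "version": "1.0.0",
    "timestamp": 1691668800.0,
    "message": "Real-Time Audio Agent API v1 is running",
    "active_sessions": None,  # nullable: None when the count is unavailable
}
print(is_live(sample))
```

Optional fields such as `active_sessions` and `session_metrics` may be null or absent, so callers should not index into them unconditionally.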

GET /api/v1/readiness¶

Comprehensive Readiness Check

Description Comprehensive readiness probe that checks all critical dependencies with timeouts.

This endpoint verifies:
- Redis connectivity and performance
- Azure OpenAI client health
- Speech services (TTS/STT) availability
- ACS caller configuration and connectivity
- RT Agents initialization
- Authentication configuration (when ENABLE_AUTH_VALIDATION=True)
- Event system health

When authentication validation is enabled, checks:
- BACKEND_AUTH_CLIENT_ID is set and is a valid GUID
- AZURE_TENANT_ID is set and is a valid GUID
- ALLOWED_CLIENT_IDS contains at least one valid GUID

Returns 503 if any critical services are unhealthy, 200 if all systems are ready.
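
The three authentication checks listed above can be approximated with the standard `uuid` module. This is only a sketch of the idea, not the service's own validation code: the helper names are hypothetical, and it assumes that `ALLOWED_CLIENT_IDS` is a comma-separated string, which this reference does not state.

```python
import uuid
from typing import List, Mapping


def is_guid(value: str) -> bool:
    """Return True for a canonical hyphenated GUID such as 'xxxxxxxx-xxxx-...'."""
    if not value:
        return False
    try:
        # Round-trip so that brace-wrapped or unhyphenated forms are rejected.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def auth_config_errors(env: Mapping[str, str]) -> List[str]:
    """Collect one message per failed check; an empty list means healthy."""
    errors = []
    for name in ("BACKEND_AUTH_CLIENT_ID", "AZURE_TENANT_ID"):
        if not is_guid(env.get(name, "")):
            errors.append(f"{name} is not a valid GUID")
    # Assumption: the allow-list is stored as a comma-separated string.
    allowed = [item.strip() for item in env.get("ALLOWED_CLIENT_IDS", "").split(",")]
    if not any(is_guid(item) for item in allowed):
        errors.append("ALLOWED_CLIENT_IDS contains no valid GUID")
    return errors
```

In a real deployment the mapping passed in would typically be `os.environ`.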

Response 200 OK¶

application/json

{
    "status": "ready",
    "timestamp": 1691668800.0,
    "response_time_ms": 45.2,
    "checks": [
        {
            "component": "redis",
            "status": "healthy",
            "check_time_ms": 12.5,
            "details": "Connected to Redis successfully"
        },
        {
            "component": "auth_configuration",
            "status": "healthy",
            "check_time_ms": 1.2,
            "details": "Auth validation enabled with 2 allowed client(s)"
        }
    ],
    "event_system": {
        "is_healthy": true,
        "handlers_count": 7,
        "domains_count": 2
    }
}

Schema of the response body

{
    "properties": {
        "status": {
            "type": "string",
            "enum": [
                "ready",
                "not_ready",
                "degraded"
            ],
            "title": "Status",
            "description": "Overall readiness status",
            "example": "ready"
        },
        "timestamp": {
            "type": "number",
            "title": "Timestamp",
            "description": "Timestamp when check was performed",
            "example": 1691668800.0
        },
        "response_time_ms": {
            "type": "number",
            "title": "Response Time Ms",
            "description": "Total time taken for all checks in milliseconds",
            "example": 45.2
        },
        "checks": {
            "items": {
                "$ref": "#/components/schemas/ServiceCheck"
            },
            "type": "array",
            "title": "Checks",
            "description": "Individual component health checks"
        },
        "event_system": {
            "anyOf": [
                {
                    "additionalProperties": true,
                    "type": "object"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Event System",
            "description": "Event system status information",
            "example": {
                "domains_count": 2,
                "handlers_count": 7,
                "is_healthy": true
            }
        }
    },
    "type": "object",
    "required": [
        "status",
        "timestamp",
        "response_time_ms",
        "checks"
    ],
    "title": "ReadinessResponse",
    "description": "Comprehensive readiness check response model.",
    "example": {
        "checks": [
            {
                "check_time_ms": 12.5,
                "component": "redis",
                "details": "Connected to Redis successfully",
                "status": "healthy"
            },
            {
                "check_time_ms": 8.3,
                "component": "azure_openai",
                "details": "Client initialized",
                "status": "healthy"
            }
        ],
        "event_system": {
            "domains_count": 2,
            "handlers_count": 7,
            "is_healthy": true
        },
        "response_time_ms": 45.2,
        "status": "ready",
        "timestamp": 1691668800.0
    }
}

Response 503 Service Unavailable¶

application/json

{
    "status": "not_ready",
    "timestamp": 1691668800.0,
    "response_time_ms": 1250.0,
    "checks": [
        {
            "component": "redis",
            "status": "unhealthy",
            "check_time_ms": 1000.0,
            "error": "Connection timeout"
        },
        {
            "component": "auth_configuration",
            "status": "unhealthy",
            "check_time_ms": 2.1,
            "error": "BACKEND_AUTH_CLIENT_ID is not a valid GUID"
        }
    ]
}
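
When the probe returns 503, the `checks` array identifies which dependencies failed. The sketch below (the function name `failing_components` is hypothetical) reduces a readiness payload to a mapping of component name to failure reason, using only the fields shown in the examples above.

```python
from typing import Any, Dict


def failing_components(readiness: Dict[str, Any]) -> Dict[str, str]:
    """Map each non-healthy component to its error (or details) text."""
    return {
        check["component"]: check.get("error") or check.get("details", "")
        for check in readiness.get("checks", [])
        if check.get("status") != "healthy"
    }


not_ready = {
    "status": "not_ready",
    "checks": [
        {"component": "redis", "status": "unhealthy", "error": "Connection timeout"},
        {"component": "azure_openai", "status": "healthy", "details": "Client initialized"},
    ],
}
print(failing_components(not_ready))
```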

GET /api/v1/agents¶

Get Agents Info

Description Get information about loaded RT agents including their configuration, model settings, and voice settings that can be modified.

Response 200 OK¶

application/json

PUT /api/v1/agents/{agent_name}¶

Update Agent Config

Description Update configuration for a specific agent (model settings, voice, etc.). Changes are applied to the runtime instance but not persisted to YAML files.

Input parameters

Parameter	In	Type	Default	Nullable	Description
agent_name	path	string		No

Request body

application/json

{
    "model": null,
    "voice": null
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the request body

{
    "properties": {
        "model": {
            "anyOf": [
                {
                    "$ref": "#/components/schemas/AgentModelUpdate"
                },
                {
                    "type": "null"
                }
            ]
        },
        "voice": {
            "anyOf": [
                {
                    "$ref": "#/components/schemas/AgentVoiceUpdate"
                },
                {
                    "type": "null"
                }
            ]
        }
    },
    "type": "object",
    "title": "AgentConfigUpdate"
}
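
A request for this endpoint can be assembled with the standard library alone. The following is a sketch under stated assumptions: `BASE_URL`, the agent name `support_agent`, and the field values are illustrative placeholders, and the field names `temperature` and `voice_name` are taken from the AgentModelUpdate and AgentVoiceUpdate entries in the Schemas section. The request is built but not sent.

```python
import json
import urllib.request
from typing import Any, Dict, Optional
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # assumption: adjust to your deployment


def build_agent_update(
    agent_name: str,
    model: Optional[Dict[str, Any]] = None,
    voice: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a PUT request carrying an AgentConfigUpdate body."""
    body = {}
    if model is not None:
        body["model"] = model
    if voice is not None:
        body["voice"] = voice
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/agents/{quote(agent_name, safe='')}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_agent_update(
    "support_agent",  # hypothetical agent name
    model={"temperature": 0.3},
    voice={"voice_name": "en-US-JennyNeural"},  # illustrative value
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. Because changes apply only to the runtime instance, the update has to be repeated after a restart.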

Response 200 OK¶

application/json

Response 422 Unprocessable Entity¶

application/json

{
    "detail": [
        {
            "loc": [
                null
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body

{
    "properties": {
        "detail": {
            "items": {
                "$ref": "#/components/schemas/ValidationError"
            },
            "type": "array",
            "title": "Detail"
        }
    },
    "type": "object",
    "title": "HTTPValidationError"
}

Call Management¶

POST /api/v1/calls/initiate¶

Initiate Outbound Call

Description Initiate a new outbound call to the specified phone number.

This endpoint:
- Validates the phone number format
- Generates a unique call ID
- Emits a call initiation event through the V1 event system
- Returns immediately with call status

The actual call establishment is handled asynchronously through Azure Communication Services.

Request body

application/json

{
    "caller_id": "+1987654321",
    "context": {
        "customer_id": "cust_12345",
        "department": "support"
    },
    "target_number": "+1234567890"
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the request body

{
    "properties": {
        "target_number": {
            "type": "string",
            "pattern": "^\\+[1-9]\\d{1,14}$",
            "title": "Target Number",
            "description": "Phone number to call in E.164 format (e.g., +1234567890)",
            "example": "+1234567890"
        },
        "caller_id": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Caller Id",
            "description": "Caller ID to display (optional, uses system default if not provided)",
            "example": "+1987654321"
        },
        "context": {
            "anyOf": [
                {
                    "additionalProperties": true,
                    "type": "object"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Context",
            "description": "Additional call context metadata",
            "example": {
                "customer_id": "cust_12345",
                "department": "support",
                "priority": "high",
                "source": "web_portal"
            }
        }
    },
    "type": "object",
    "required": [
        "target_number"
    ],
    "title": "CallInitiateRequest",
    "description": "Request model for initiating a call.",
    "example": {
        "caller_id": "+1987654321",
        "context": {
            "customer_id": "cust_12345",
            "department": "support"
        },
        "target_number": "+1234567890"
    }
}
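
Validating the number on the client avoids a round trip that would end in a 400 response. The sketch below applies the same pattern that the schema declares for `target_number`; the helper names are hypothetical, and the `re.ASCII` flag is an added precaution so that `\d` matches only the digits 0 to 9.

```python
import re
from typing import Any, Dict, Optional

# Pattern copied from the CallInitiateRequest schema.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$", re.ASCII)


def is_e164(number: str) -> bool:
    """Return True when the number matches the schema's E.164 pattern."""
    return E164_PATTERN.fullmatch(number) is not None


def build_call_request(
    target_number: str,
    caller_id: Optional[str] = None,
    context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a CallInitiateRequest body, rejecting malformed numbers early."""
    if not is_e164(target_number):
        raise ValueError(f"not an E.164 number: {target_number!r}")
    body: Dict[str, Any] = {"target_number": target_number}
    if caller_id is not None:
        body["caller_id"] = caller_id
    if context is not None:
        body["context"] = context
    return body


print(build_call_request("+1234567890", context={"department": "support"}))
```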

Response 200 OK¶

application/json

{
    "call_id": "call_abc12345",
    "status": "initiating",
    "target_number": "+1234567890",
    "message": "Call initiation requested for +1234567890"
}

Schema of the response body

{
    "properties": {
        "call_id": {
            "type": "string",
            "title": "Call Id",
            "description": "Unique call identifier",
            "example": "call_abc12345"
        },
        "status": {
            "type": "string",
            "title": "Status",
            "description": "Current call status",
            "example": "initiating"
        },
        "target_number": {
            "type": "string",
            "title": "Target Number",
            "description": "Target phone number",
            "example": "+1234567890"
        },
        "message": {
            "type": "string",
            "title": "Message",
            "description": "Human-readable status message",
            "example": "Call initiation requested"
        }
    },
    "type": "object",
    "required": [
        "call_id",
        "status",
        "target_number",
        "message"
    ],
    "title": "CallInitiateResponse",
    "description": "Response model for call initiation.",
    "example": {
        "call_id": "call_abc12345",
        "message": "Call initiation requested for +1234567890",
        "status": "initiating",
        "target_number": "+1234567890"
    }
}

Response 400 Bad Request¶

application/json

{
    "detail": "Invalid phone number format. Must be in E.164 format (e.g., +1234567890)"
}

Response 500 Internal Server Error¶

application/json

{
    "detail": "Failed to initiate call: Azure Communication Service unavailable"
}

Response 422 Unprocessable Entity¶

application/json

{
    "detail": [
        {
            "loc": [
                null
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body

{
    "properties": {
        "detail": {
            "items": {
                "$ref": "#/components/schemas/ValidationError"
            },
            "type": "array",
            "title": "Detail"
        }
    },
    "type": "object",
    "title": "HTTPValidationError"
}

GET /api/v1/calls/¶

List Calls

Description Retrieve a paginated list of calls with optional filtering.

Supports:
- Pagination with page and limit parameters
- Filtering by call status
- Sorting by creation time (newest first)

Input parameters

Parameter	In	Type	Default	Nullable	Description
limit	query	integer	10	No	Number of items per page (1-100)
page	query	integer	1	No	Page number (1-based)
status_filter	query	None		No	Filter calls by status

Response 200 OK¶

application/json

{
    "calls": [
        {
            "call_id": "call_abc12345",
            "status": "connected",
            "duration": 120,
            "participants": [],
            "events": []
        }
    ],
    "total": 25,
    "page": 1,
    "limit": 10
}

Schema of the response body

{
    "properties": {
        "calls": {
            "items": {
                "$ref": "#/components/schemas/CallStatusResponse"
            },
            "type": "array",
            "title": "Calls",
            "description": "List of calls"
        },
        "total": {
            "type": "integer",
            "title": "Total",
            "description": "Total number of calls matching criteria",
            "example": 25
        },
        "page": {
            "type": "integer",
            "title": "Page",
            "description": "Current page number (1-based)",
            "default": 1,
            "example": 1
        },
        "limit": {
            "type": "integer",
            "title": "Limit",
            "description": "Number of items per page",
            "default": 10,
            "example": 10
        }
    },
    "type": "object",
    "required": [
        "calls",
        "total"
    ],
    "title": "CallListResponse",
    "description": "Response model for listing calls.",
    "example": {
        "calls": [
            {
                "call_id": "call_abc12345",
                "duration": 120,
                "events": [],
                "participants": [],
                "status": "connected"
            }
        ],
        "limit": 10,
        "page": 1,
        "total": 25
    }
}
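
The `total` and `limit` fields are enough to work out how many pages a client needs to request. A minimal sketch (the function name is hypothetical):

```python
def page_count(total: int, limit: int) -> int:
    """Number of pages needed to cover `total` items at `limit` items per page."""
    if limit < 1:
        raise ValueError("limit must be 1 or greater")
    # Ceiling division using integers only.
    return (total + limit - 1) // limit


# With the example response above: 25 calls at 10 per page.
print(page_count(total=25, limit=10))
```

For the example payload this gives three pages, so valid `page` values are 1 through 3.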

Response 400 Bad Request¶

application/json

{
    "detail": "Page number must be positive"
}

Response 422 Unprocessable Entity¶

application/json

{
    "detail": [
        {
            "loc": [
                null
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body

{
    "properties": {
        "detail": {
            "items": {
                "$ref": "#/components/schemas/ValidationError"
            },
            "type": "array",
            "title": "Detail"
        }
    },
    "type": "object",
    "title": "HTTPValidationError"
}

POST /api/v1/calls/answer¶

Answer Inbound Call

Description Handle inbound call events and Event Grid subscription validation.

This endpoint:
- Validates Event Grid subscription requests
- Answers incoming calls automatically with orchestrator selection
- Initializes conversation state with features
- Supports pluggable conversation orchestrators
- Provides advanced tracing and monitoring

Enhanced V1 features:
- Pluggable orchestrator injection for conversation handling
- Enhanced state management with orchestrator metadata
- Advanced observability and correlation
- Production-ready error handling
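
For context on the first bullet: Azure Event Grid confirms a webhook subscription by posting an event of type `Microsoft.EventGrid.SubscriptionValidationEvent`, and the receiver must echo the event's `validationCode` back as `validationResponse`. The sketch below illustrates that handshake and how incoming-call events might be separated from it. It is not this service's handler, and both function names are hypothetical.

```python
from typing import Any, Dict, List, Optional

VALIDATION_EVENT = "Microsoft.EventGrid.SubscriptionValidationEvent"
INCOMING_CALL_EVENT = "Microsoft.Communication.IncomingCall"


def validation_response(events: List[Dict[str, Any]]) -> Optional[Dict[str, str]]:
    """Return the handshake body if the batch contains a validation event."""
    for event in events:
        if event.get("eventType") == VALIDATION_EVENT:
            return {"validationResponse": event["data"]["validationCode"]}
    return None


def incoming_calls(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the data payloads of incoming-call events in the batch."""
    return [e.get("data", {}) for e in events if e.get("eventType") == INCOMING_CALL_EVENT]


handshake = [{"eventType": VALIDATION_EVENT, "data": {"validationCode": "abc-123"}}]
print(validation_response(handshake))
```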

Response 200 OK¶

application/json

{
    "status": "call answered",
    "orchestrator": "gpt_flow",
    "acs_features": {
        "orchestrator_support": true,
        "advanced_tracing": true,
        "api_version": "v1"
    }
}

Response 400 Bad Request¶

application/json

{
    "detail": "Invalid Event Grid request format"
}

Response 503 Service Unavailable¶

application/json

{
    "detail": "ACS not initialised"
}

POST /api/v1/calls/callbacks¶

Handle ACS Callback Events

Description Handle Azure Communication Services callback events.

This endpoint receives webhooks from ACS when call events occur:
- Call connected/disconnected
- Participant joined/left
- Media events (DTMF tones, play completed, etc.)
- Transfer events

The endpoint validates authentication, processes events through the
V1 CallEventProcessor system, and returns processing results.
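
To make the event flow concrete, here is a rough sketch of type-based dispatch over a callback batch. It is not the V1 CallEventProcessor, whose interface is not documented here. It assumes that callbacks arrive as a list of CloudEvents-style objects with a `type` such as `Microsoft.Communication.CallConnected` and a `data.callConnectionId` field; the `handle_callbacks` function and the handler registry are hypothetical.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


def handle_callbacks(events: List[Dict[str, Any]], handlers: Dict[str, Handler]) -> Dict[str, Any]:
    """Run the handler registered for each event type and summarize the batch."""
    processed = 0
    call_connection_id = None
    for event in events:
        handler = handlers.get(event.get("type", ""))
        if handler is None:
            continue  # event types without a handler are skipped
        data = event.get("data", {})
        handler(data)
        processed += 1
        call_connection_id = data.get("callConnectionId", call_connection_id)
    return {
        "status": "success",
        "processed_events": processed,
        "call_connection_id": call_connection_id,
    }


connected: List[Dict[str, Any]] = []
batch = [{"type": "Microsoft.Communication.CallConnected", "data": {"callConnectionId": "abc123"}}]
result = handle_callbacks(batch, {"Microsoft.Communication.CallConnected": connected.append})
print(result)
```

The returned dictionary mirrors the shape of the 200 response example for this endpoint.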

Response 200 OK¶

application/json

{
    "status": "success",
    "processed_events": 1,
    "call_connection_id": "abc123"
}

Response 500 Internal Server Error¶

application/json

{
    "error": "Failed to process callback events"
}

Response 503 Service Unavailable¶

application/json

{
    "error": "ACS not initialised"
}

ACS Media Session¶

GET /api/v1/media/status¶

Get Media Streaming Status

Description Get the current status of media streaming configuration.

Returns: dict: Current media streaming configuration and status.

Response 200 OK¶

application/json

Schema of the response body

{
    "additionalProperties": true,
    "type": "object",
    "title": "Response Get Media Status Api V1 Media Status Get"
}

POST /api/v1/media/sessions¶

Create Media Session

Description Create a new media streaming session for Azure Communication Services.

Initializes a media session with specified audio configuration and returns WebSocket connection details for real-time audio streaming. This endpoint prepares the infrastructure for bidirectional media communication with configurable audio parameters.

Args: request: Media session configuration including call connection ID, audio format, sample rate, and streaming options.

Returns: MediaSessionResponse: Session details containing unique session ID, WebSocket URL for streaming, status, and audio configuration.

Raises: HTTPException: When session creation fails due to invalid configuration or system resource constraints.

Example:

    >>> request = MediaSessionRequest(call_connection_id="call_123")
    >>> response = await create_media_session(request)
    >>> print(response.websocket_url)

Request body

application/json

{
    "audio_format": "pcm_16",
    "call_connection_id": "call_12345",
    "channels": 1,
    "chunk_size": 1024,
    "enable_transcription": true,
    "enable_vad": true,
    "sample_rate": 16000
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the request body

{
    "properties": {
        "call_connection_id": {
            "type": "string",
            "title": "Call Connection Id",
            "description": "ACS call connection identifier",
            "example": "call_12345"
        },
        "sample_rate": {
            "anyOf": [
                {
                    "type": "integer"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Sample Rate",
            "description": "Audio sample rate in Hz",
            "default": 16000,
            "example": 16000
        },
        "channels": {
            "anyOf": [
                {
                    "type": "integer"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Channels",
            "description": "Number of audio channels",
            "default": 1,
            "example": 1
        },
        "audio_format": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Audio Format",
            "description": "Audio format (pcm_16, pcm_24, opus, etc.)",
            "default": "pcm_16",
            "example": "pcm_16"
        },
        "chunk_size": {
            "anyOf": [
                {
                    "type": "integer"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Chunk Size",
            "description": "Audio chunk size in bytes",
            "default": 1024,
            "example": 1024
        },
        "enable_transcription": {
            "anyOf": [
                {
                    "type": "boolean"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Enable Transcription",
            "description": "Enable real-time transcription",
            "default": true,
            "example": true
        },
        "enable_vad": {
            "anyOf": [
                {
                    "type": "boolean"
                },
                {
                    "type": "null"
                }
            ],
            "title": "Enable Vad",
            "description": "Enable voice activity detection",
            "default": true,
            "example": true
        }
    },
    "type": "object",
    "required": [
        "call_connection_id"
    ],
    "title": "MediaSessionRequest",
    "description": "Request schema for starting a media session.",
    "example": {
        "audio_format": "pcm_16",
        "call_connection_id": "call_12345",
        "channels": 1,
        "chunk_size": 1024,
        "enable_transcription": true,
        "enable_vad": true,
        "sample_rate": 16000
    }
}
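
The audio parameters interact: together they determine how much audio each chunk carries. The sketch below works this out for uncompressed PCM, assuming that `pcm_16` and `pcm_24` mean 2 and 3 bytes per sample respectively (the schema only names the formats). Compressed formats such as `opus` have no fixed byte-to-time ratio and are rejected.

```python
# Assumption: bytes per sample implied by the format names.
BYTES_PER_SAMPLE = {"pcm_16": 2, "pcm_24": 3}


def chunk_duration_ms(
    chunk_size: int = 1024,
    sample_rate: int = 16000,
    channels: int = 1,
    audio_format: str = "pcm_16",
) -> float:
    """Milliseconds of audio in one chunk, using the schema defaults."""
    if audio_format not in BYTES_PER_SAMPLE:
        raise ValueError(f"no fixed chunk duration for format {audio_format!r}")
    bytes_per_second = sample_rate * channels * BYTES_PER_SAMPLE[audio_format]
    return 1000 * chunk_size / bytes_per_second


print(chunk_duration_ms())
```

With the defaults, one second of audio is 16000 × 1 × 2 = 32000 bytes, so a 1024-byte chunk holds 32 ms.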

Response 200 OK¶

application/json

{
    "configuration": {
        "channels": 1,
        "chunk_size": 1024,
        "format": "pcm_16",
        "sample_rate": 16000
    },
    "created_at": "2025-08-10T13:45:00Z",
    "session_id": "media_session_123456",
    "status": "active",
    "websocket_url": "wss://api.example.com/v1/media/stream/media_session_123456"
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body

{
    "properties": {
        "session_id": {
            "type": "string",
            "title": "Session Id",
            "description": "Unique media session identifier",
            "example": "media_session_123456"
        },
        "websocket_url": {
            "type": "string",
            "title": "Websocket Url",
            "description": "WebSocket URL for audio streaming",
            "example": "wss://api.example.com/v1/media/stream/media_session_123456"
        },
        "status": {
            "type": "string",
            "title": "Status",
            "description": "Session status",
            "example": "active"
        },
        "created_at": {
            "type": "string",
            "title": "Created At",
            "description": "Session creation timestamp",
            "example": "2025-08-10T13:45:00Z"
        },
        "configuration": {
            "additionalProperties": true,
            "type": "object",
            "title": "Configuration",
            "description": "Session configuration settings",
            "example": {
                "channels": 1,
                "chunk_size": 1024,
                "format": "pcm_16",
                "sample_rate": 16000
            }
        }
    },
    "type": "object",
    "required": [
        "session_id",
        "websocket_url",
        "status",
        "created_at",
        "configuration"
    ],
    "title": "MediaSessionResponse",
    "description": "Response schema for media session creation.",
    "example": {
        "configuration": {
            "channels": 1,
            "chunk_size": 1024,
            "format": "pcm_16",
            "sample_rate": 16000
        },
        "created_at": "2025-08-10T13:45:00Z",
        "session_id": "media_session_123456",
        "status": "active",
        "websocket_url": "wss://api.example.com/v1/media/stream/media_session_123456"
    }
}

Response 422 Unprocessable Entity¶

application/json

{
    "detail": [
        {
            "loc": [
                null
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body

{
    "properties": {
        "detail": {
            "items": {
                "$ref": "#/components/schemas/ValidationError"
            },
            "type": "array",
            "title": "Detail"
        }
    },
    "type": "object",
    "title": "HTTPValidationError"
}

GET /api/v1/media/sessions/{session_id}¶

Get Media Session Status

Description Retrieve status and metadata for a specific media session.

Queries the current state of an active media session including connection status, WebSocket state, and session configuration details. Used for monitoring and debugging media streaming sessions.

Args: session_id: Unique identifier for the media session to query.

Returns: dict: Session information including status, connection state, creation timestamp, and API version details.

Example:

    >>> session_info = await get_media_session("media_session_123")
    >>> print(session_info["status"])

Input parameters

Parameter	In	Type	Default	Nullable	Description
session_id	path	string		No

Response 200 OK¶

application/json

Schema of the response body

{
    "type": "object",
    "additionalProperties": true,
    "title": "Response Get Media Session Api V1 Media Sessions  Session Id  Get"
}

Response 422 Unprocessable Entity¶

application/json

{
    "detail": [
        {
            "loc": [
                null
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}

This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body

{
    "properties": {
        "detail": {
            "items": {
                "$ref": "#/components/schemas/ValidationError"
            },
            "type": "array",
            "title": "Detail"
        }
    },
    "type": "object",
    "title": "HTTPValidationError"
}

Real-time Communication¶

GET /api/v1/realtime/status¶

Get Realtime Service Status

Description Get the current status of the realtime communication service.

Returns information about:
- Service availability and health
- Supported protocols and features
- Active connection counts
- WebSocket endpoint configurations

Response 200 OK¶

application/json

{
    "status": "available",
    "websocket_endpoints": {
        "dashboard_relay": "/api/v1/realtime/dashboard/relay",
        "conversation": "/api/v1/realtime/conversation"
    },
    "features": {
        "dashboard_broadcasting": true,
        "conversation_streaming": true,
        "orchestrator_support": true,
        "session_management": true
    },
    "active_connections": {
        "dashboard_clients": 0,
        "conversation_sessions": 0
    },
    "version": "v1"
}

Schema of the response body

{
    "properties": {
        "status": {
            "type": "string",
            "enum": [
                "available",
                "degraded",
                "unavailable"
            ],
            "title": "Status",
            "description": "Current service status",
            "example": "available"
        },
        "websocket_endpoints": {
            "additionalProperties": {
                "type": "string"
            },
            "type": "object",
            "title": "Websocket Endpoints",
            "description": "Available WebSocket endpoints",
            "example": {
                "conversation": "/api/v1/realtime/conversation",
                "dashboard_relay": "/api/v1/realtime/dashboard/relay"
            }
        },
        "features": {
            "additionalProperties": {
                "type": "boolean"
            },
            "type": "object",
            "title": "Features",
            "description": "Supported features and capabilities",
            "example": {
                "conversation_streaming": true,
                "dashboard_broadcasting": true,
                "orchestrator_support": true,
                "session_management": true
            }
        },
        "active_connections": {
            "additionalProperties": {
                "type": "integer"
            },
            "type": "object",
            "title": "Active Connections",
            "description": "Current active connection counts",
            "example": {
                "conversation_sessions": 0,
                "dashboard_clients": 0
            }
        },
        "protocols_supported": {
            "items": {
                "type": "string"
            },
            "type": "array",
            "title": "Protocols Supported",
            "description": "Supported communication protocols",
            "default": [
                "WebSocket"
            ],
            "example": [
                "WebSocket"
            ]
        },
        "version": {
            "type": "string",
            "title": "Version",
            "description": "API version",
            "default": "v1",
            "example": "v1"
        }
    },
    "type": "object",
    "required": [
        "status",
        "websocket_endpoints",
        "features",
        "active_connections"
    ],
    "title": "RealtimeStatusResponse",
    "description": "Response schema for realtime service status endpoint.\n\nProvides comprehensive information about the realtime communication\nservice including availability, features, and active connections."
}
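A client may want to sanity-check a status payload before relying on it. The following is a minimal sketch, assuming the response body has already been parsed from JSON into a dict. The function name `check_realtime_status` is hypothetical and not part of the API; the required fields, value types, and the three status values are taken from the schema above.

```python
REQUIRED_FIELDS = ("status", "websocket_endpoints", "features", "active_connections")
ALLOWED_STATUS = {"available", "degraded", "unavailable"}
# Expected value type for each map-valued field, per the schema's additionalProperties.
MAP_VALUE_TYPES = (("websocket_endpoints", str), ("features", bool), ("active_connections", int))


def check_realtime_status(payload: dict) -> list:
    """Return a list of problems found in a RealtimeStatusResponse-shaped dict."""
    problems = [f"missing required field: {name}" for name in REQUIRED_FIELDS if name not in payload]
    status = payload.get("status")
    if status is not None and status not in ALLOWED_STATUS:
        problems.append(f"unexpected status: {status!r}")
    for name, expected in MAP_VALUE_TYPES:
        value = payload.get(name, {})
        if not isinstance(value, dict):
            problems.append(f"{name} must be an object")
            continue
        for key, item in value.items():
            # bool is a subclass of int in Python, so reject it explicitly for integer maps.
            wrong_bool = expected is int and isinstance(item, bool)
            if not isinstance(item, expected) or wrong_bool:
                problems.append(f"{name}.{key} has wrong type")
    return problems
```

An empty list means the payload matches the shape described by the schema; optional fields such as `protocols_supported` and `version` are not checked in this sketch.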

Schemas¶

AgentConfigUpdate¶

Name	Type
model
voice

AgentModelUpdate¶

Name	Type
deployment_id
max_tokens
temperature
top_p

AgentVoiceUpdate¶

Name	Type
voice_name
voice_style

CallInitiateRequest¶

Name	Type
caller_id
context
target_number	string

CallInitiateResponse¶

Name	Type
call_id	string
message	string
status	string
target_number	string

CallListResponse¶

Name	Type
calls	Array<CallStatusResponse>
limit	integer
page	integer
total	integer

CallStatusResponse¶

Name	Type
call_id	string
duration
events	Array<>
participants	Array<>
status	string

HealthResponse¶

Name	Type
active_sessions
details	Example: `{'api_version': 'v1', 'service': 'rtagent-backend'}`
message	string
session_metrics
status	string
timestamp	number
version	string

HTTPValidationError¶

Name	Type
detail	Array<ValidationError>

MediaSessionRequest¶

Name	Type
audio_format
call_connection_id	string
channels
chunk_size
enable_transcription
enable_vad
sample_rate

MediaSessionResponse¶

Name	Type
configuration	Example: `{'channels': 1, 'chunk_size': 1024, 'format': 'pcm_16', 'sample_rate': 16000}`
created_at	string
session_id	string
status	string
websocket_url	string

ReadinessResponse¶

Name	Type
checks	Array<ServiceCheck>
event_system
response_time_ms	number
status	string
timestamp	number

RealtimeStatusResponse¶

Name	Type
active_connections	Example: `{'conversation_sessions': 0, 'dashboard_clients': 0}`
features	Example: `{'conversation_streaming': True, 'dashboard_broadcasting': True, 'orchestrator_support': True, 'session_management': True}`
protocols_supported	Array<string>
status	string
version	string
websocket_endpoints	Example: `{'conversation': '/api/v1/realtime/conversation', 'dashboard_relay': '/api/v1/realtime/dashboard/relay'}`

ServiceCheck¶

Name	Type
check_time_ms	number
component	string
details
error
status	string

ValidationError¶

Name	Type
loc	Array<>
msg	string
type	string

WebSocket Endpoints¶

The following WebSocket endpoints provide real-time communication capabilities:

Media Streaming WebSocket¶

URL: wss://api.domain.com/api/v1/media/stream

Real-time bidirectional audio streaming for Azure Communication Services (ACS) calls, following the ACS WebSocket protocol.

Query Parameters:

- call_connection_id (required): ACS call connection identifier
- session_id (optional): Browser session ID for UI coordination

Audio Formats:

- MEDIA/TRANSCRIPTION Mode: PCM 16kHz mono (16-bit)
- VOICE_LIVE Mode: PCM 24kHz mono (16-bit) for Azure OpenAI Realtime API

Message Types:

// Incoming audio data from ACS
{
  "kind": "AudioData",
  "audioData": {
    "timestamp": "2025-09-28T12:00:00Z",
    "participantRawID": "8:acs:...",
    "data": "base64EncodedPCMAudio",
    "silent": false
  }
}

// Outgoing audio data (bidirectional streaming)
{
  "Kind": "AudioData",
  "AudioData": {
    "Data": "base64EncodedPCMAudio"
  }
}
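The two envelopes above differ in key casing, which is easy to get wrong by hand. The sketch below shows one way to decode the incoming shape and build the outgoing one using only the standard library. The function names are hypothetical, and skipping frames flagged `silent` is a design choice of this example rather than documented server behavior.

```python
import base64
import json
from typing import Optional


def decode_incoming_audio(message: str) -> Optional[bytes]:
    """Extract raw PCM bytes from an incoming AudioData message.

    Returns None for other message kinds and for frames flagged as silent.
    """
    payload = json.loads(message)
    if payload.get("kind") != "AudioData":
        return None
    audio = payload.get("audioData", {})
    if audio.get("silent"):
        return None  # assumption: silent frames carry nothing worth processing
    return base64.b64decode(audio["data"])


def encode_outgoing_audio(pcm: bytes) -> str:
    """Wrap raw PCM bytes in the outgoing AudioData envelope (PascalCase keys)."""
    encoded = base64.b64encode(pcm).decode("ascii")
    return json.dumps({"Kind": "AudioData", "AudioData": {"Data": encoded}})
```

Keeping the encode and decode paths in two small functions makes the casing difference explicit in one place instead of scattering it across handlers.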

Browser Conversation WebSocket¶

URL: wss://api.domain.com/api/v1/browser/conversation

Browser-based voice conversations with session persistence and real-time transcription.

Query Parameters:

- session_id (optional): Session identifier for restoration
- streaming_mode (optional): VOICE_LIVE or REALTIME (defaults to REALTIME)
- user_email (optional): User email for session context

Features:

- Real-time speech-to-text transcription
- TTS audio streaming for responses
- Barge-in detection and handling
- Conversation context persistence
- Multi-language support

Message Types:

// Binary: Raw PCM audio frames (16kHz or 24kHz depending on mode)

// Text: Control messages
{
  "kind": "StopAudio"  // Signal audio buffer commit
}
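Since all three query parameters are optional, the connection URL is easiest to assemble programmatically. The following sketch builds it with `urllib.parse.urlencode`, which also handles escaping of values such as email addresses. The helper name `conversation_url` is hypothetical, and the host is the placeholder used on this page.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "wss://api.domain.com/api/v1/browser/conversation"  # placeholder host from this page
STREAMING_MODES = ("VOICE_LIVE", "REALTIME")


def conversation_url(session_id: Optional[str] = None,
                     streaming_mode: Optional[str] = None,
                     user_email: Optional[str] = None) -> str:
    """Build the Browser Conversation WebSocket URL, omitting unset parameters."""
    if streaming_mode is not None and streaming_mode not in STREAMING_MODES:
        raise ValueError(f"streaming_mode must be one of {STREAMING_MODES}")
    candidates = (
        ("session_id", session_id),
        ("streaming_mode", streaming_mode),
        ("user_email", user_email),
    )
    params = {key: value for key, value in candidates if value is not None}
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL
```

The resulting string can be passed to whichever WebSocket client the application already uses.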

Dashboard Relay WebSocket¶

URL: wss://api.domain.com/api/v1/browser/dashboard/relay

Real-time updates for dashboard clients monitoring ongoing conversations.

Query Parameters:

- session_id (optional): Filter updates for specific conversation sessions

Use Cases:

- Live call monitoring and analytics
- Real-time transcript viewing
- Agent performance dashboards
- Connection status monitoring

Authentication & Security¶

All endpoints support Microsoft Entra ID (formerly Azure Active Directory) authentication using DefaultAzureCredential, following Azure best practices.

Authentication Methods¶

Environment Variables (Recommended for production):

# Service Principal Authentication
export AZURE_CLIENT_ID="your-client-id"
export AZURE_CLIENT_SECRET="your-client-secret"
export AZURE_TENANT_ID="your-tenant-id"

Azure CLI (Development):

az login

Managed Identity (Azure deployment):

- System-assigned or user-assigned managed identity
- No credential management required
- Automatic token refresh
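For the service principal path, environment-based authentication needs all three variables to be set. A small pre-flight check can report which ones are missing before the application starts. This is a sketch using only the standard library; the function name is hypothetical and it does not contact Azure or validate the values.

```python
import os
from typing import List, Mapping

SERVICE_PRINCIPAL_VARS = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")


def missing_service_principal_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the service principal variables that are unset or empty."""
    return [name for name in SERVICE_PRINCIPAL_VARS if not env.get(name)]
```

An empty result only confirms that the variables are present. With Azure CLI or managed identity authentication these variables are not needed, so a non-empty result is not necessarily an error.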

Required RBAC Roles¶

Grant these Azure roles to your service principal or managed identity:

Service	Required Role	Purpose
Azure Speech Services	Cognitive Services User	STT/TTS operations
Azure Cache for Redis	Redis Cache Contributor	Session state management
Azure Communication Services	Communication Services Contributor	Call automation and media streaming
Azure Storage	Storage Blob Data Contributor	Call recordings and artifacts
Azure OpenAI	Cognitive Services OpenAI User	AI model inference

Security Features¶

Credential-less authentication with managed identity
Connection pooling with automatic token refresh
TLS encryption for all HTTP/WebSocket connections
Input validation and request sanitization
Rate limiting per Azure service quotas

Configuration¶

Required Environment Variables¶

Azure Services Configuration:

# Azure Speech Services
AZURE_SPEECH_REGION=eastus
AZURE_SPEECH_RESOURCE_ID=/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{name}

# Azure Cache for Redis
AZURE_REDIS_HOSTNAME=your-redis.redis.cache.windows.net
AZURE_REDIS_USERNAME=default

# Azure Communication Services
ACS_ENDPOINT=https://your-acs.communication.azure.com

Application Configuration:

# Streaming Mode (affects audio processing pipeline)
ACS_STREAMING_MODE=MEDIA  # MEDIA | VOICE_LIVE | TRANSCRIPTION

# Optional Settings
AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com  # For AI features
AZURE_STORAGE_CONNECTION_STRING=...  # For call recordings

Streaming Mode Configuration¶

Controls the audio processing pipeline and determines handler selection:

Mode	Description	Audio Format	Use Case
`MEDIA`	Default STT/TTS pipeline	PCM 16kHz mono	Traditional phone calls with AI orchestration
`VOICE_LIVE`	Azure OpenAI Realtime API	PCM 24kHz mono	Advanced conversational AI
`TRANSCRIPTION`	Real-time transcription only	PCM 16kHz mono	Call recording and analysis

📖 Reference: Streaming modes documentation
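The table above can be mirrored in client or test code to pick a sample rate and size audio buffers. The sketch below is illustrative only: the names `StreamingMode`, `parse_mode`, and `frame_bytes` are hypothetical, the 20 ms frame duration is an arbitrary example value, and 2 bytes per sample assumes 16-bit PCM.

```python
from enum import Enum


class StreamingMode(str, Enum):
    MEDIA = "MEDIA"
    VOICE_LIVE = "VOICE_LIVE"
    TRANSCRIPTION = "TRANSCRIPTION"


# Sample rates from the streaming mode table.
SAMPLE_RATE_HZ = {
    StreamingMode.MEDIA: 16000,
    StreamingMode.VOICE_LIVE: 24000,
    StreamingMode.TRANSCRIPTION: 16000,
}


def parse_mode(raw: str = "MEDIA") -> StreamingMode:
    """Parse an ACS_STREAMING_MODE value; raises ValueError for unknown modes."""
    return StreamingMode(raw.strip().upper())


def frame_bytes(mode: StreamingMode, frame_ms: int = 20, bytes_per_sample: int = 2) -> int:
    """Size in bytes of one mono PCM frame of the given duration."""
    samples = SAMPLE_RATE_HZ[mode] * frame_ms // 1000
    return samples * bytes_per_sample
```

Under these assumptions a 20 ms frame is 640 bytes at 16kHz and 960 bytes at 24kHz, which illustrates why buffers sized for one mode should not be reused for the other.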

Performance Tuning¶

Connection Pools (optional):

# Speech service connection limits
MAX_STT_POOL_SIZE=4
MAX_TTS_POOL_SIZE=4

# Redis connection pool
REDIS_MAX_CONNECTIONS=20
REDIS_CONNECTION_TIMEOUT=5

Audio Processing:

# Voice Activity Detection (VAD) settings
VAD_TIMEOUT_MS=2000  # Silence timeout
VAD_SENSITIVITY=medium  # low | medium | high

# Barge-in detection
BARGE_IN_ENABLED=true
BARGE_IN_THRESHOLD_MS=10  # Response time for interruption
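One way to consume these settings is to read them once at startup into a typed, immutable object, so that malformed values fail early. The sketch below is an assumption about how a loader could look, not the service's actual implementation. The fallback values simply repeat the sample values shown above and may not match the real defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TuningConfig:
    max_stt_pool_size: int
    max_tts_pool_size: int
    redis_max_connections: int
    redis_connection_timeout: int
    vad_timeout_ms: int
    vad_sensitivity: str
    barge_in_enabled: bool
    barge_in_threshold_ms: int


def load_tuning(env: Mapping[str, str] = os.environ) -> TuningConfig:
    """Build a TuningConfig from environment-style key/value pairs."""
    sensitivity = env.get("VAD_SENSITIVITY", "medium").lower()
    if sensitivity not in ("low", "medium", "high"):
        raise ValueError(f"VAD_SENSITIVITY must be low, medium, or high, got {sensitivity!r}")
    return TuningConfig(
        max_stt_pool_size=int(env.get("MAX_STT_POOL_SIZE", "4")),
        max_tts_pool_size=int(env.get("MAX_TTS_POOL_SIZE", "4")),
        redis_max_connections=int(env.get("REDIS_MAX_CONNECTIONS", "20")),
        redis_connection_timeout=int(env.get("REDIS_CONNECTION_TIMEOUT", "5")),
        vad_timeout_ms=int(env.get("VAD_TIMEOUT_MS", "2000")),
        vad_sensitivity=sensitivity,
        barge_in_enabled=env.get("BARGE_IN_ENABLED", "true").lower() == "true",
        barge_in_threshold_ms=int(env.get("BARGE_IN_THRESHOLD_MS", "10")),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in unit tests.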

Error Handling¶

Standard Error Response Format¶

All endpoints return errors in a consistent JSON format modeled on RFC 7807 (Problem Details for HTTP APIs):

{
  "detail": "Human-readable error description",
  "status_code": 400,
  "timestamp": "2025-09-28T12:00:00Z",
  "type": "validation_error",
  "instance": "/api/v1/calls/initiate",
  "errors": [
    {
      "field": "target_number",
      "message": "Invalid phone number format",
      "code": "format_invalid"
    }
  ]
}
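Clients typically want to turn this structure into a short log line or user-facing message. The sketch below, with a hypothetical function name, flattens an error body of the shape shown above into text. It treats every key as optional so that partial error bodies do not raise.

```python
import json


def summarize_error(body: str) -> str:
    """Render an error response body as a multi-line, human-readable summary."""
    payload = json.loads(body)
    status = payload.get("status_code", "?")
    kind = payload.get("type", "error")
    lines = [f"{status} {kind}: {payload.get('detail', '')}"]
    # One indented line per field-level error, if any are present.
    for item in payload.get("errors", []):
        field = item.get("field", "?")
        lines.append(f"  {field}: {item.get('message', '')} ({item.get('code', '')})")
    return "\n".join(lines)
```

The first line carries the status, error type, and description; each entry in `errors` follows on its own indented line.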

HTTP Status Codes¶

Status	Description	Common Causes
200	Success	Request completed successfully
202	Accepted	Async operation initiated
400	Bad Request	Invalid request format or parameters
401	Unauthorized	Missing or invalid authentication
403	Forbidden	Insufficient permissions or RBAC roles
404	Not Found	Resource not found
422	Validation Error	Request body schema validation failed
429	Rate Limited	Azure service quota exceeded
500	Internal Server Error	Unexpected server error
502	Bad Gateway	Azure service unavailable
503	Service Unavailable	Dependencies not ready
504	Gateway Timeout	Azure service timeout

Service-Specific Errors¶

Azure Speech Services:

- speech_quota_exceeded - API rate limit reached
- speech_region_unavailable - Speech service region down
- audio_format_unsupported - Invalid audio format specified

Azure Communication Services:

- call_not_found - Call connection ID invalid
- media_streaming_failed - WebSocket streaming error
- pstn_number_invalid - Phone number format error

Azure Cache for Redis:

- redis_connection_failed - Redis cluster unavailable
- session_expired - Session data TTL exceeded

Retry Strategy¶

The API implements exponential backoff for transient errors:

# Retry configuration
RETRY_MAX_ATTEMPTS=3
RETRY_BACKOFF_FACTOR=2.0
RETRY_JITTER=true

# Service-specific timeouts
SPEECH_REQUEST_TIMEOUT=30
ACS_CALL_TIMEOUT=60
REDIS_OPERATION_TIMEOUT=5

📖 Reference: Azure Service reliability patterns
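To make the retry settings concrete, the following sketch computes the wait times an exponential backoff schedule would produce. It is not the service's implementation. It assumes that RETRY_MAX_ATTEMPTS counts the initial attempt, that the base delay is one second (`base_delay` is a hypothetical parameter with no corresponding setting above), and that jitter means "full jitter", where each delay is drawn uniformly between zero and the computed cap.

```python
import random
from typing import List, Optional


def backoff_delays(max_attempts: int = 3,
                   backoff_factor: float = 2.0,
                   base_delay: float = 1.0,
                   jitter: bool = True,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Seconds to wait before each retry; max_attempts includes the first try."""
    rng = rng or random.Random()
    delays = []
    for retry in range(max_attempts - 1):
        cap = base_delay * backoff_factor ** retry  # 1x, 2x, 4x, ... of base_delay
        delays.append(rng.uniform(0, cap) if jitter else cap)
    return delays
```

With the values shown above and jitter disabled, this yields two waits of 1.0 and 2.0 seconds. Jitter spreads retries from many clients over time so they do not all hit a recovering service at the same instant.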

Getting Started¶

Quick Setup¶

Configure Authentication:

export AZURE_TENANT_ID="your-tenant-id"
export AZURE_CLIENT_ID="your-client-id"
export AZURE_CLIENT_SECRET="your-client-secret"

Set Required Environment Variables:

export AZURE_SPEECH_REGION="eastus"
export ACS_ENDPOINT="https://your-acs.communication.azure.com"
export AZURE_REDIS_HOSTNAME="your-redis.redis.cache.windows.net"

Test Health Endpoint:

curl -X GET https://api.domain.com/api/v1/health

Initiate a Test Call:

curl -X POST https://api.domain.com/api/v1/calls/initiate \
  -H "Content-Type: application/json" \
  -d '{"target_number": "+1234567890"}'
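The same call can be prepared from Python with the standard library. This sketch only builds the request object and does not send it. It uses the `target_number` and `context` fields from the CallInitiateRequest schema; the helper name is hypothetical and the host is the placeholder used on this page.

```python
import json
from typing import Optional
from urllib.request import Request

API_BASE = "https://api.domain.com/api/v1"  # placeholder host from this page


def build_initiate_request(target_number: str, context: Optional[dict] = None) -> Request:
    """Prepare a POST /calls/initiate request with a JSON body."""
    body = {"target_number": target_number}
    if context is not None:
        body["context"] = context
    return Request(
        f"{API_BASE}/calls/initiate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call. Authentication headers, if the deployment requires them, would need to be added separately.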

Development Resources¶

Interactive API Explorer - Test all endpoints directly in browser
Streaming Modes - WebSocket connection examples
Local Development - Development setup and configuration
Architecture Overview - System design and deployment patterns

Production Considerations¶

Use managed identity authentication in Azure deployments
Configure connection pooling for high-throughput scenarios
Enable distributed tracing with Azure Monitor integration
Implement health checks for all dependent services
Set up monitoring and alerting for service reliability

📖 Reference: Production deployment guide