14 changes: 14 additions & 0 deletions sdk/agentserver/azure-ai-agentserver-invocations/CHANGELOG.md
@@ -4,6 +4,20 @@

### Features Added

- `invocations_ws` (WebSocket) protocol support on `InvocationAgentServerHost`.
Register a handler with the new `@app.ws_handler` decorator to host a
full-duplex WebSocket endpoint at `/invocations_ws` on the same host that
serves `POST /invocations`. The SDK calls `await websocket.accept()` before
invoking the handler, runs WebSocket Ping/Pong keep-alive in the background
(default 30 s; configurable via the new `ws_ping_interval` constructor
argument), closes the connection cleanly on handler return, and maps
uncaught exceptions to RFC 6455 close code `1011`. Each connection emits a
structured close-event log line carrying `ws.session_id`, `ws.close_code`,
and `ws.duration_ms`, and the same fields are recorded as OpenTelemetry
span attributes. `/readiness`, OTEL export, graceful shutdown, and the
`x-platform-server` identity header continue to be inherited from
`azure-ai-agentserver-core`.

### Breaking Changes

### Bugs Fixed
60 changes: 59 additions & 1 deletion sdk/agentserver/azure-ai-agentserver-invocations/README.md
@@ -1,6 +1,9 @@
# Azure AI Agent Server Invocations client library for Python

The `azure-ai-agentserver-invocations` package provides the invocation protocol endpoints for Azure AI Hosted Agent containers. It plugs into the [`azure-ai-agentserver-core`](https://pypi.org/project/azure-ai-agentserver-core/) host framework and adds the full invocation lifecycle: `POST /invocations`, `GET /invocations/{id}`, `POST /invocations/{id}/cancel`, and `GET /invocations/docs/openapi.json`.
The `azure-ai-agentserver-invocations` package provides the invocation protocol endpoints for Azure AI Hosted Agent containers. It plugs into the [`azure-ai-agentserver-core`](https://pypi.org/project/azure-ai-agentserver-core/) host framework and supports two transports on the same host:

- **HTTP** (`invocations` protocol) — `POST /invocations`, `GET /invocations/{id}`, `POST /invocations/{id}/cancel`, `GET /invocations/docs/openapi.json`.
- **WebSocket** (`invocations_ws` protocol) — full-duplex streaming at `/invocations_ws`, registered with `@app.ws_handler`.

## Getting started

@@ -25,6 +28,7 @@ This automatically installs `azure-ai-agentserver-core` as a dependency.
- `@app.invoke_handler` — **Required.** Handles `POST /invocations`.
- `@app.get_invocation_handler` — Optional. Handles `GET /invocations/{id}`.
- `@app.cancel_invocation_handler` — Optional. Handles `POST /invocations/{id}/cancel`.
- `@app.ws_handler` — Optional. Handles WebSocket connections at `/invocations_ws`.

### Protocol endpoints

@@ -34,6 +38,7 @@ This automatically installs `azure-ai-agentserver-core` as a dependency.
| `GET` | `/invocations/{invocation_id}` | No | Retrieve invocation status or result |
| `POST` | `/invocations/{invocation_id}/cancel` | No | Cancel a running invocation |
| `GET` | `/invocations/docs/openapi.json` | No | Serve the agent's OpenAPI 3.x spec |
| `WS` | `/invocations_ws` | No | Full-duplex WebSocket transport (`invocations_ws` protocol) |

### Request and response headers

@@ -182,6 +187,57 @@ app = InvocationAgentServerHost(openapi_spec={
})
```

## WebSocket protocol (`invocations_ws`)

The same `InvocationAgentServerHost` object also exposes a WebSocket transport at `/invocations_ws`. Container authors do not need to install or import a second package; registering a handler with `@app.ws_handler` is the only step. A multi-protocol agent shares one host, one session, and one process.

### Quick start

```python
from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.requests import Request
from starlette.responses import JSONResponse, Response
from starlette.websockets import WebSocket

app = InvocationAgentServerHost()


@app.invoke_handler # POST /invocations (HTTP)
async def invoke(request: Request) -> Response:
payload = await request.json()
return JSONResponse({"echo": payload})


@app.ws_handler # /invocations_ws (WebSocket)
async def ws(websocket: WebSocket) -> None:
async for message in websocket.iter_text():
await websocket.send_text(message)


app.run()
```

### What the SDK does for `@app.ws_handler`

- Registers `/invocations_ws` on the same Starlette host as `/invocations` and `/readiness`.
- Calls `await websocket.accept()` before invoking your handler.
- Runs WebSocket Ping/Pong keep-alive in the background (default 30 s, configurable via `InvocationAgentServerHost(ws_ping_interval=...)`; set `ws_ping_interval=0` to disable). Frames are sent at the WebSocket protocol layer (RFC 6455 opcode `0x9`/`0xA`) by the underlying Hypercorn server, which keeps the connection alive across the ~4 minute idle timeouts of Azure APIM and Azure Load Balancer without any extra application traffic.
- Closes the connection cleanly on handler return (close code `1000`) or maps an uncaught handler exception to close code `1011`.
- Emits a structured close-event log line carrying `ws.session_id`, `ws.close_code`, and `ws.duration_ms`. The same fields are recorded as OpenTelemetry span attributes so the connection lifetime is visible end-to-end.
- Inherits `/readiness`, OpenTelemetry export, graceful shutdown, and the `x-platform-server` identity header from `azure-ai-agentserver-core`.
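
The close-code behavior in the list above can be sketched in isolation. The snippet below is an illustrative mapping of handler outcome to RFC 6455 close codes that mirrors the documented behavior; it is not the SDK's internal implementation:

```python
import asyncio

# RFC 6455 close codes, as described for the invocations_ws host.
CLOSE_NORMAL = 1000          # handler returned cleanly
CLOSE_INTERNAL_ERROR = 1011  # handler raised an unhandled exception


async def resolve_close_code(handler, websocket):
    """Run a WS handler and pick the close code the server would send."""
    try:
        await handler(websocket)
    except Exception:
        return CLOSE_INTERNAL_ERROR
    return CLOSE_NORMAL


async def demo():
    async def ok(ws):
        pass  # clean return -> 1000

    async def boom(ws):
        raise RuntimeError("unhandled")  # uncaught -> 1011

    return await resolve_close_code(ok, None), await resolve_close_code(boom, None)
```

Note that `1012` (service restart) is reserved for graceful shutdown, which the host drives itself rather than the handler.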

### Handler signature

The handler receives a Starlette [`WebSocket`][starlette-ws] and returns `None`. The full WebSocket API — `iter_text`, `iter_bytes`, `iter_json`, `send_text`, `send_bytes`, `send_json`, `close`, `headers`, `query_params`, `client`, `state` — is available, so application protocols on top of `invocations_ws` are entirely under your control.

[starlette-ws]: https://www.starlette.io/websockets/
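
Since the payload format over `/invocations_ws` is unconstrained, one common pattern is a small JSON envelope. The `type`/`data` shape below is purely illustrative and not part of the SDK:

```python
import json


def encode_event(event_type: str, data: object) -> str:
    """Serialize one application-level event as a JSON text frame."""
    return json.dumps({"type": event_type, "data": data})


def decode_event(text: str) -> tuple[str, object]:
    """Parse a text frame produced by encode_event."""
    msg = json.loads(text)
    return msg["type"], msg["data"]
```

Inside an `@app.ws_handler`, helpers like these would wrap `websocket.iter_text()` on the receive side and `websocket.send_text(...)` on the send side.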

### Reference: configuration

| Constructor argument | Default | Description |
|---|---|---|
| `ws_ping_interval` | `30.0` (seconds) | WebSocket protocol Ping interval. `0` disables keep-alive. Negative or non-finite values are rejected. |

## Troubleshooting

### Reporting issues
@@ -196,6 +252,8 @@ Visit the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/
|---|---|
| [simple_invoke_agent](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-invocations/samples/simple_invoke_agent/) | Minimal synchronous request-response |
| [async_invoke_agent](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-invocations/samples/async_invoke_agent/) | Long-running operations with polling and cancellation |
| [ws_invoke_agent](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-invocations/samples/ws_invoke_agent/) | Combined `POST /invocations` (HTTP) and `/invocations_ws` (WebSocket) host |
| [ws_bidirectional_streaming_agent](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/agentserver/azure-ai-agentserver-invocations/samples/ws_bidirectional_streaming_agent/) | Full-duplex `/invocations_ws` agent: server-pushed heartbeats + concurrent token streams + mid-flight cancel |

## Contributing

@@ -19,3 +19,31 @@ class InvocationConstants:
ATTR_SPAN_SESSION_ID = "azure.ai.agentserver.invocations.session_id"
ATTR_SPAN_ERROR_CODE = "azure.ai.agentserver.invocations.error.code"
ATTR_SPAN_ERROR_MESSAGE = "azure.ai.agentserver.invocations.error.message"


class InvocationsWSConstants:
"""invocations_ws (WebSocket) protocol constants.

Route, span attribute keys, and ping/pong defaults for the
WebSocket endpoint hosted alongside the HTTP invocations protocol.
"""

# Route
ROUTE_PATH = "/invocations_ws"

# Default WebSocket Ping interval in seconds.
# Azure APIM and Azure Load Balancer drop idle WebSocket connections
# after ~4 minutes; 30 s gives a comfortable safety margin.
DEFAULT_PING_INTERVAL_S = 30.0

# Close codes (RFC 6455)
CLOSE_NORMAL = 1000 # handler returned cleanly
CLOSE_INTERNAL_ERROR = 1011 # handler raised an unhandled exception
CLOSE_SERVICE_RESTART = 1012 # graceful shutdown drained the connection

# Span attribute keys
ATTR_SPAN_SESSION_ID = "ws.session_id"
ATTR_SPAN_CLOSE_CODE = "ws.close_code"
ATTR_SPAN_DURATION_MS = "ws.duration_ms"
ATTR_SPAN_ERROR_CODE = "azure.ai.agentserver.invocations_ws.error.code"
ATTR_SPAN_ERROR_MESSAGE = "azure.ai.agentserver.invocations_ws.error.message"
@@ -31,6 +31,7 @@
)

from ._constants import InvocationConstants
from ._invocation_ws import _WSHandlerMixin

logger = logging.getLogger("azure.ai.agentserver")

@@ -93,13 +94,18 @@ def _sanitize_id(value: str, fallback: str) -> str:
return value


class InvocationAgentServerHost(AgentServerHost):
class InvocationAgentServerHost(_WSHandlerMixin, AgentServerHost):
"""Invocation protocol host for Azure AI Hosted Agents.

A :class:`~azure.ai.agentserver.core.AgentServerHost` subclass that adds
the invocation protocol endpoints. Use the decorator methods to wire
handler functions to the endpoints.

The same host object also exposes the ``invocations_ws`` (WebSocket)
transport at ``/invocations_ws``; register a handler with the
:meth:`ws_handler` decorator. Multi-protocol agents share a single
host, session, and process.

For multi-protocol agents, compose via cooperative inheritance::

class MyHost(InvocationAgentServerHost, ResponsesAgentServerHost):
@@ -108,18 +114,29 @@ class MyHost(InvocationAgentServerHost, ResponsesAgentServerHost):
Usage::

from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.websockets import WebSocket

app = InvocationAgentServerHost()

@app.invoke_handler
@app.invoke_handler # POST /invocations
async def handle(request):
return JSONResponse({"ok": True})

@app.ws_handler # /invocations_ws
async def ws(websocket: WebSocket) -> None:
async for message in websocket.iter_text():
await websocket.send_text(message)

app.run()

:param openapi_spec: Optional OpenAPI spec dict. When provided, the spec
is served at ``GET /invocations/docs/openapi.json``.
:type openapi_spec: Optional[dict[str, Any]]
:param ws_ping_interval: Seconds between WebSocket protocol Ping frames
on ``/invocations_ws``. ``None`` (default) selects 30 s; ``0``
disables keep-alive. Configured on the underlying Hypercorn
server so the framing is opcode 0x9 / 0xA, not application JSON.
:type ws_ping_interval: Optional[float]
"""

_INSTRUMENTATION_SCOPE = "Azure.AI.AgentServer.Invocations"
@@ -128,15 +145,19 @@ def __init__(
self,
*,
openapi_spec: Optional[dict[str, Any]] = None,
ws_ping_interval: Optional[float] = None,
**kwargs: Any,
) -> None:
self._invoke_fn: Optional[Callable] = None
self._get_invocation_fn: Optional[Callable] = None
self._cancel_invocation_fn: Optional[Callable] = None
self._openapi_spec = openapi_spec

# Initialise WS handler slots (raises ValueError on a bad interval).
self._init_ws_state(ws_ping_interval)

# Build invocation routes and pass to parent via routes kwarg
invocation_routes = [
invocation_routes: list[Any] = [
Route(
"/invocations/docs/openapi.json",
self._get_openapi_spec_endpoint,
@@ -161,6 +182,7 @@ def __init__(
methods=["POST"],
name="cancel_invocation",
),
self._build_ws_route(self._ws_endpoint),
]

# Merge with any routes from sibling mixins via cooperative init
@@ -169,10 +191,53 @@

# --- Invocations startup configuration logging ---
logger.info(
"Invocations protocol: openapi_spec_configured=%s",
"Invocations protocol: openapi_spec_configured=%s, "
"ws_ping_interval=%s",
self._openapi_spec is not None,
(
"disabled"
if self._ws_ping_interval == 0
else f"{self._ws_ping_interval}s"
),
)

# ------------------------------------------------------------------
# Hypercorn server config (WebSocket Ping/Pong keep-alive)
# ------------------------------------------------------------------

def _build_hypercorn_config(self, host: str, port: int) -> object:
"""Extend the base Hypercorn config with the WebSocket Ping interval.

Hypercorn sends WS protocol Ping frames every
``websocket_ping_interval`` seconds on every active WebSocket
connection — exactly the keep-alive the ``invocations_ws`` spec
requires. ``ws_ping_interval=0`` leaves the default
``None`` (disabled).

:param host: Network interface to bind.
:type host: str
:param port: Port to bind.
:type port: int
:return: The configured Hypercorn config.
:rtype: hypercorn.config.Config
"""
config = super()._build_hypercorn_config(host, port)
if self._ws_ping_interval and self._ws_ping_interval > 0:
try:
# ``websocket_ping_interval`` is a float-or-None on
# Hypercorn ≥0.14; assigning a positive float enables
# protocol-level Ping frames.
config.websocket_ping_interval = self._ws_ping_interval # type: ignore[attr-defined]
except Exception: # pylint: disable=broad-exception-caught
# Hypercorn <0.14 does not support per-server WS ping —
# leave the default and warn so operators can upgrade.
logger.warning(
"Hypercorn does not support websocket_ping_interval; "
"WebSocket keep-alive will be best-effort.",
exc_info=True,
)
return config

# ------------------------------------------------------------------
# Handler decorators
# ------------------------------------------------------------------