[BUG] SageMakerAIModel does not support SageMaker Async Inference endpoints #1969

@tomo-vzc

Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

1.33.0

Python Version

3.13.12

Operating System

Ubuntu

Installation Method

pip

Steps to Reproduce

  1. Deploy any model to a SageMaker endpoint with AsyncInferenceConfig:
from sagemaker.core.inference_config import AsyncInferenceConfig

async_config = AsyncInferenceConfig(
    output_path="s3://my-bucket/async-outputs/",
)

model_builder.deploy(
    instance_type="ml.g5.xlarge",
    initial_instance_count=1,
    endpoint_name="my-async-endpoint",
    inference_config=async_config,
)
  2. Use SageMakerAIModel to invoke it:
from strands import Agent
from strands.models.sagemaker import SageMakerAIModel

model = SageMakerAIModel(
    endpoint_config={
        "endpoint_name": "my-async-endpoint",
        "region_name": "us-west-2",
    },
    payload_config={"max_tokens": 100, "stream": False},
)

agent = Agent(model=model)
result = agent("Say hello.")  # raises ValidationError
  3. Both stream=True and stream=False fail with a ValidationError.

Expected Behavior

The agent returns a response from the model.

Actual Behavior

FAILED: ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Endpoint XXXX does not support this inference type.
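
For context, this error is what the real-time InvokeEndpoint operation returns when the target endpoint was deployed with an async inference config. A minimal sketch of detecting that case up front is shown here. It assumes a boto3 `sagemaker` client is passed in and that DescribeEndpoint reports an `AsyncInferenceConfig` entry for async endpoints; `is_async_endpoint` is a hypothetical helper name, not part of Strands.

```python
def is_async_endpoint(sagemaker_client, endpoint_name: str) -> bool:
    """Return True if the endpoint description carries an async inference config.

    `sagemaker_client` is assumed to be a boto3 client,
    e.g. boto3.client("sagemaker", region_name="us-west-2").
    """
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    # Assumption: real-time endpoints omit this key (or leave it empty).
    return bool(description.get("AsyncInferenceConfig"))
```

A check like this could let a caller fail early with a clearer message, or route to the async API, instead of surfacing the raw ValidationError.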

Motivation

SageMaker Async Inference appears to be the recommended approach for long-running inference (>60s), which is common for large LLMs with long prompts. The 60-second timeout on real-time invoke_endpoint calls makes real-time endpoints unusable for many production workloads.
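
Until something like this is supported in SageMakerAIModel, an async endpoint can be called outside Strands through the `invoke_endpoint_async` operation of the boto3 `sagemaker-runtime` client, which reads the request from S3 and writes the result to S3. The following is a rough sketch of that flow, not a proposed implementation. The function name `invoke_async`, the `async-inputs` prefix, and the JSON payload and response formats are assumptions for illustration; the actual formats depend on the model container.

```python
import json
import time
import uuid
from urllib.parse import urlparse


def invoke_async(runtime_client, s3_client, endpoint_name, payload,
                 input_bucket, input_prefix="async-inputs",
                 poll_seconds=2.0, timeout_seconds=900.0):
    """Send `payload` to an async endpoint and wait for its JSON result.

    Assumes boto3 clients: runtime_client = boto3.client("sagemaker-runtime")
    and s3_client = boto3.client("s3"). Other names are illustrative.
    """
    # Async endpoints read the request body from S3, so upload it first.
    input_key = f"{input_prefix}/{uuid.uuid4()}.json"
    s3_client.put_object(
        Bucket=input_bucket,
        Key=input_key,
        Body=json.dumps(payload).encode("utf-8"),
    )

    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=f"s3://{input_bucket}/{input_key}",
        ContentType="application/json",
    )

    # The call returns immediately with the S3 URI where the result will land.
    output = urlparse(response["OutputLocation"])
    bucket, key = output.netloc, output.path.lstrip("/")

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        try:
            obj = s3_client.get_object(Bucket=bucket, Key=key)
            return json.loads(obj["Body"].read())
        except s3_client.exceptions.NoSuchKey:
            time.sleep(poll_seconds)  # result not written yet, keep polling
    raise TimeoutError(
        f"No output at {response['OutputLocation']} after {timeout_seconds}s"
    )
```

This sketch only polls the output location. A failed inference never writes an output object, so a fuller version would presumably also check the failure location returned by the call, or use the endpoint's SNS notifications instead of polling.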

Metadata

    Labels

    bug (Something isn't working)
