Problem
The current `LlmConfig` hierarchy needs a dedicated class per provider (`OpenAiConfig`, `OpenAiCompatibleConfig`, `VllmConfig`, `OllamaConfig`, `OciGenAiConfig`). This doesn't scale:
- Every new provider requires a spec change. Anthropic, Vertex AI, Bedrock, Azure OpenAI, Mistral, Groq, Together AI all need new classes, SDK updates, adapter updates, and a new spec version.
- "OpenAI Compatible" isn't universal. Anthropic has its own Messages API, Bedrock uses IAM auth, Azure needs deployment names and API versions, Vertex uses GCP service accounts. Forcing these into an OpenAI-compatible shape loses important details.
- Auth is too narrow. Only `api_key` is supported. Real deployments need IAM roles, service accounts, managed identity, OIDC tokens, or no auth at all.
- Generation parameters are incomplete. Only `max_tokens`, `temperature`, and `top_p` are defined. No guidance for `top_k`, `stop_sequences`, `seed`, `frequency_penalty`, `response_format`, `json_schema`.
- Provider identity is tangled up with wire protocol. OCI GenAI can speak OpenAI Chat Completions. Azure OpenAI is OpenAI's API with different auth/endpoints. The spec can't separate who the provider is from what protocol they use.
Proposal
Introduce a single provider-agnostic config using free-form string discriminators instead of per-provider classes:
- `provider.type` string discriminator (`"openai"`, `"anthropic"`, `"aws_bedrock"`, etc.) with extra fields for provider-specific options. Optional `api_protocol` hint for when the wire protocol differs from what the type implies.
- Separate `auth` object with its own type discriminator and credential resolution (`$env:VAR_NAME` for env vars).
- Richer generation parameters with explicit fields for common options.
- `provider_extensions` escape hatch for non-portable options.
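The shape above can be sketched as a Pydantic model. This is illustrative only: the nested class names (`ProviderConfig`, `AuthConfig`, `GenerationParameters`) and the `extra="allow"` policy for provider-specific fields are assumptions, not part of the proposal.

```python
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, Field


class ProviderConfig(BaseModel):
    """Free-form provider descriptor; extra fields carry provider-specific options."""
    model_config = ConfigDict(extra="allow")   # e.g. region, endpoint (assumption)
    type: str                                  # "openai", "anthropic", "aws_bedrock", ...
    api_protocol: Optional[str] = None         # hint, e.g. "openai_chat_completions"


class AuthConfig(BaseModel):
    model_config = ConfigDict(extra="allow")
    type: str                                  # "api_key", "iam_role", ...
    credential_ref: Optional[str] = None       # e.g. "$env:ANTHROPIC_API_KEY"


class GenerationParameters(BaseModel):
    max_tokens: Optional[int] = None
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    top_k: Optional[int] = None
    stop_sequences: Optional[list[str]] = None
    seed: Optional[int] = None


class GenericLlmConfig(BaseModel):
    # protected_namespaces=() silences Pydantic's warning about the model_ prefix
    model_config = ConfigDict(protected_namespaces=())
    component_type: str = "GenericLlmConfig"
    model_id: str
    provider: ProviderConfig
    auth: Optional[AuthConfig] = None          # null => unauthenticated endpoint
    default_generation_parameters: Optional[GenerationParameters] = None
    provider_extensions: dict[str, Any] = Field(default_factory=dict)
```

With `extra="allow"`, provider-specific fields like `endpoint` or `region` pass validation and remain accessible on the parsed object without any per-provider subclass.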
Backward Compatibility
`GenericLlmConfig` is a new `component_type` that coexists with existing configs. No implicit mapping between them. Runtimes should support both, dispatching based on `component_type` through the existing `PydanticComponentDeserializationPlugin` system.
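A minimal sketch of that dispatch, using a plain registry as a stand-in for the actual `PydanticComponentDeserializationPlugin` mechanism (whose interface is not shown in this section); the `register`/`deserialize` names and the toy config classes are assumptions for illustration:

```python
from typing import Any, Callable

# Hypothetical stand-in for the real plugin registry.
CONFIG_REGISTRY: dict[str, type] = {}


def register(component_type: str) -> Callable[[type], type]:
    """Class decorator mapping a component_type string to its config class."""
    def wrap(cls: type) -> type:
        CONFIG_REGISTRY[component_type] = cls
        return cls
    return wrap


def deserialize(raw: dict[str, Any]) -> Any:
    """Dispatch purely on component_type, so old and new configs coexist."""
    cls = CONFIG_REGISTRY[raw["component_type"]]
    return cls(**{k: v for k, v in raw.items() if k != "component_type"})


@register("OpenAiConfig")        # legacy per-provider class keeps working
class OpenAiConfig:
    def __init__(self, model_id: str, **kwargs: Any) -> None:
        self.model_id = model_id


@register("GenericLlmConfig")    # new class registered alongside, no migration
class GenericLlmConfig:
    def __init__(self, model_id: str, provider: dict, **kwargs: Any) -> None:
        self.model_id = model_id
        self.provider = provider
```

Because dispatch keys on the `component_type` string alone, adding the new config never touches the deserialization path of the legacy classes.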
Examples
Anthropic
```yaml
component_type: GenericLlmConfig
model_id: "claude-sonnet-4-20250514"
provider:
  type: "anthropic"
auth:
  type: "api_key"
  credential_ref: "$env:ANTHROPIC_API_KEY"
default_generation_parameters:
  max_tokens: 4096
  temperature: 0.7
```
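The `credential_ref` above uses the `$env:` scheme; a minimal resolver sketch (treating any non-`$env:` string as a literal secret is an assumption beyond what the proposal defines):

```python
import os
from typing import Optional


def resolve_credential(credential_ref: Optional[str]) -> Optional[str]:
    """Resolve a credential_ref; only the "$env:" scheme is defined so far."""
    if credential_ref is None:
        return None                        # auth: null -- nothing to resolve
    if credential_ref.startswith("$env:"):
        name = credential_ref[len("$env:"):]
        try:
            return os.environ[name]
        except KeyError:
            # Fail loudly rather than sending an empty credential downstream.
            raise ValueError(f"environment variable {name!r} is not set") from None
    # Assumption: anything else is a literal secret value.
    return credential_ref
```

Other schemes (vault URIs, secret-manager ARNs) could slot in beside `$env:` later without changing the config shape.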
AWS Bedrock (same shape, different provider)
```yaml
component_type: GenericLlmConfig
model_id: "anthropic.claude-3-sonnet-20240229-v1:0"
provider:
  type: "aws_bedrock"
  region: "us-east-1"
auth:
  type: "iam_role"
```
Local Ollama (no auth needed)
```yaml
component_type: GenericLlmConfig
model_id: "llama3"
provider:
  type: "ollama"
  endpoint: "http://localhost:11434"
auth: null
```
Open model served from vLLM
```yaml
component_type: GenericLlmConfig
model_id: "meta-llama/Llama-3.1-70B-Instruct"
provider:
  type: "vllm"
  endpoint: "http://gpu-cluster:8000"
  api_protocol: "openai_chat_completions"
auth: null
```
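The `api_protocol` hint in the vLLM example overrides whatever default the provider `type` implies. A sketch of that resolution; the default-protocol table is illustrative (only `openai_chat_completions` appears in this proposal), not normative:

```python
# Illustrative defaults per provider type -- the identifiers other than
# "openai_chat_completions" are assumptions, not spec-defined values.
DEFAULT_PROTOCOLS: dict[str, str] = {
    "openai": "openai_chat_completions",
    "vllm": "openai_chat_completions",
    "ollama": "ollama_native",
    "anthropic": "anthropic_messages",
    "aws_bedrock": "bedrock_converse",
}


def effective_protocol(provider: dict) -> str:
    """api_protocol, when present, wins over the default implied by type."""
    explicit = provider.get("api_protocol")
    if explicit is not None:
        return explicit
    return DEFAULT_PROTOCOLS[provider["type"]]
```

This keeps provider identity ("who") independent of wire protocol ("how"): OCI GenAI speaking OpenAI Chat Completions is just a `type` plus an `api_protocol` override.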