| layout | title |
|---|---|
| default | System Services Overview |
ModelMesh Lite is composed of a set of cooperating runtime services that together implement capability-based routing, model lifecycle management, provider abstraction, and observability. This document describes the overall architecture, initialization sequence, request flow, and service groupings. For conceptual foundations see SystemConcept.md; for YAML configuration see SystemConfiguration.md.
ModelMesh (facade)
├── Router
│   ├── RoutingPipeline
│   │   ├── CapabilityResolver → CapabilityTree
│   │   ├── DeliveryFilter
│   │   ├── StateFilter
│   │   └── SelectionStrategy
│   ├── RetryPolicy
│   └── CapabilityPool[]
│       ├── Model[] → ModelState
│       ├── Provider[] → ProviderState
│       └── RotationPolicy
│           ├── DeactivationEvaluator
│           ├── RecoveryEvaluator
│           └── SelectionStrategy
├── ModelRegistry
├── ConnectorRegistry
├── OpenAIClient → Router
├── ProxyServer → Router
├── EventEmitter → ObservabilityConnector[]
├── RequestLogger → ObservabilityConnector[]
├── StatisticsCollector → ObservabilityConnector[]
├── SecretResolver → SecretStoreConnector
└── StateManager → StorageConnector
The following steps execute in order when ModelMesh.initialize(config) is called:
1. Parse configuration -- Load YAML or programmatic configuration into the internal `MeshConfig` structure.
2. Register connectors -- Instantiate the `ConnectorRegistry` and load all built-in and custom connector packages.
3. Resolve secrets -- Initialize the `SecretResolver` with the configured secret store connector. Resolve all `${secrets:name}` references in the configuration.
4. Load persisted state -- Initialize the `StateManager` with the configured storage connector. Call `load()` to restore `ModelState`, `ProviderState`, and pool memberships from the previous session.
5. Build capability tree -- Construct the `CapabilityTree` from the default hierarchy plus any custom extensions declared in configuration.
6. Register models and providers -- Populate the `ModelRegistry` with model definitions from configuration. Instantiate `Provider` wrappers for each configured provider.
7. Build capability pools -- Create `CapabilityPool` instances for each configured pool, assign models based on capability node membership, and attach `RotationPolicy` components.
8. Wire the routing pipeline -- Assemble the `RoutingPipeline` with default stages (CapabilityResolver, pool selection, DeliveryFilter, StateFilter, SelectionStrategy, RetryPolicy).
9. Initialize the router -- Create the `Router` with the assembled pipeline and pool set.
10. Start observability services -- Initialize `EventEmitter`, `RequestLogger`, and `StatisticsCollector` with their configured observability connectors.
11. Start background services -- Launch discovery sync, health monitor probes, periodic state sync, and statistics flush timers.
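
The secret-resolution step above can be illustrated with a small sketch. It is not the actual `SecretResolver` implementation: the function name `resolve_secrets`, the set of characters allowed in a secret name, and the in-memory `store` standing in for a secret store connector are all assumptions made for illustration. Only the `${secrets:name}` reference syntax comes from this document.

```python
import re
from typing import Any, Callable

# Matches ${secrets:name}. The characters allowed in a name are an assumption.
SECRET_REF = re.compile(r"\$\{secrets:([A-Za-z0-9_.\-]+)\}")


def resolve_secrets(node: Any, lookup: Callable[[str], str]) -> Any:
    """Return a copy of `node` with every ${secrets:name} reference replaced."""
    if isinstance(node, str):
        return SECRET_REF.sub(lambda m: lookup(m.group(1)), node)
    if isinstance(node, dict):
        return {key: resolve_secrets(value, lookup) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_secrets(item, lookup) for item in node]
    return node  # numbers, booleans, and None pass through unchanged


# Hypothetical in-memory store standing in for a secret store connector.
store = {"openai_key": "sk-test"}
config = {
    "providers": [{"name": "openai", "api_key": "${secrets:openai_key}"}],
    "timeout_s": 30,
}
resolved = resolve_secrets(config, store.__getitem__)
```

In this sketch a reference to an unknown secret raises `KeyError` from the lookup, so a misconfigured reference fails during initialization rather than at request time. Whether ModelMesh Lite behaves the same way is not stated here.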
A typical synchronous completion request follows this path:
1. The application calls `OpenAIClient.chat.completions.create(model="text-generation", messages=[...])` or sends an HTTP request to `ProxyServer` at `POST /v1/chat/completions`.
2. The virtual model name `"text-generation"` is passed to `Router.complete()` as the capability identifier.
3. The `Router` invokes `RoutingPipeline.execute()`, which runs each stage in sequence:
   - CapabilityResolver maps `"text-generation"` to matching `CapabilityPool` instances using the `CapabilityTree`.
   - Pool selection chooses the target pool (single match or priority-based).
   - DeliveryFilter excludes models that do not support the requested delivery mode (sync, streaming, or batch).
   - StateFilter excludes standby models and models from deactivated providers.
   - SelectionStrategy scores remaining candidates and selects the best model.
4. The `Router` sends the request to the selected `Provider.execute()`, which delegates to the underlying provider connector.
5. On success, the response flows back through the `Router`. `RequestLogger` records the request, `StatisticsCollector` buffers metrics, and `ModelState` is updated.
6. On failure, `RetryPolicy` determines whether to retry the same model (with backoff) or rotate to the next candidate. If deactivation thresholds are reached, the `RotationPolicy` moves the model to standby and `EventEmitter` publishes a `model_deactivated` event.
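
The filter, select, retry, and rotate behaviour described above can be sketched in a few lines. This is a simplified illustration, not the library's code: the `Candidate` fields, the numeric `score`, the `max_attempts` and `deactivate_after` parameters, and the `fake_provider` callable are all hypothetical. Capability resolution, pool selection, and backoff delays are left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    delivery_modes: frozenset  # e.g. {"sync", "streaming"}
    active: bool               # False stands for standby or a deactivated provider
    score: float               # higher is better; the scoring itself is hypothetical
    failures: int = 0


def delivery_filter(cands, mode):
    # Keep only models that support the requested delivery mode.
    return [c for c in cands if mode in c.delivery_modes]


def state_filter(cands):
    # Drop standby models and models from deactivated providers.
    return [c for c in cands if c.active]


def rank(cands):
    # Order the remaining candidates from best to worst score.
    return sorted(cands, key=lambda c: c.score, reverse=True)


def complete(pool, mode, call, max_attempts=2, deactivate_after=3):
    """Try the best candidate first; retry it, then rotate to the next one."""
    events = []
    for cand in rank(state_filter(delivery_filter(pool, mode))):
        for _ in range(max_attempts):
            try:
                return call(cand), events
            except RuntimeError:
                cand.failures += 1
                if cand.failures >= deactivate_after:
                    cand.active = False  # move the model to standby
                    events.append(("model_deactivated", cand.name))
                    break
    raise RuntimeError("no candidate could serve the request")


pool = [
    Candidate("model-a", frozenset({"sync"}), True, 0.9, failures=1),
    Candidate("model-b", frozenset({"sync", "streaming"}), True, 0.5),
    Candidate("model-c", frozenset({"batch"}), True, 1.0),
    Candidate("model-d", frozenset({"sync"}), False, 1.0),
]


def fake_provider(cand):
    # Stand-in for Provider.execute(); model-a always fails in this example.
    if cand.name == "model-a":
        raise RuntimeError("upstream error")
    return f"response from {cand.name}"


response, events = complete(pool, "sync", fake_provider)
```

Here `model-c` is removed by the delivery filter and `model-d` by the state filter. `model-a` is tried twice, reaches the assumed failure threshold, and is moved to standby, after which the request rotates to `model-b`.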
| Group | Services | Purpose |
|---|---|---|
| Facade | ModelMesh | Library entry point; initializes and wires all subsystems |
| Routing | Router, RoutingPipeline, CapabilityResolver, DeliveryFilter, StateFilter, RetryPolicy | Request orchestration, pipeline stages, and retry logic |
| Pools & Models | CapabilityPool, Model, ProviderService, ModelState, ProviderState | Model grouping, runtime state, and provider abstraction |
| Rotation | RotationPolicyService, DeactivationEvaluator, RecoveryEvaluator, SelectionStrategy | Deactivation, recovery, and selection governance |
| Registries | ModelRegistry, ConnectorRegistry, CapabilityTree | Model catalogue, connector catalogue, capability hierarchy |
| External Interfaces | OpenAIClient, ProxyServer | Application-facing API surfaces |
| Observability | EventEmitter, RequestLogger, StatisticsCollector | Events, logging, and metrics |
| Infrastructure | SecretResolver, StateManager | Secret resolution and state persistence |
- SystemConcept.md -- Conceptual architecture, design principles, and capability model
- SystemConfiguration.md -- Full YAML configuration reference
- SystemServices.md -- Consolidated service reference (source for individual docs)
- ConnectorInterfaces.md -- Connector API contracts