feat(decisioning): LazyPlatformRouter for tenant-on-first-request construction #552

Merged
bokelley merged 1 commit into main from bokelley/lazy-platform-router
May 4, 2026

bokelley commented May 4, 2026

Summary

Closes #547. Drop-in alternative to PlatformRouter that defers per-tenant DecisioningPlatform construction to first request, with a bounded LRU + TTL cache. Closes the eager-router pain point for adopters with N tenants × per-tenant SDK auth handshake (Google Ad Manager service-account, Kevel API key) where boot scales O(N).

  • Async or sync factory; awaited via inspect.isawaitable, matching CallableSubdomainTenantRouter's convention (feat(server): CallableSubdomainTenantRouter for DB-backed tenant lookups #544)
  • Bounded cache: cache_size > 0 mandatory (default 256); cache_ttl_seconds >= 0 (default 3600.0; 0 = size-only eviction). Distinct from CallableSubdomainTenantRouter which rejects ttl=0 — tenants go stale, platform adapters don't (unless the factory reads mutable config — docstring calls that out)
  • invalidate(tenant_id=None) for hot-reload. Per-tenant + global generation counter snapshots prevent the invalidate-during-build race from resurrecting an evicted slot
  • Drop-in: isinstance(router, DecisioningPlatform) is true, serve() accepts it identically, ACCOUNT_NOT_FOUND / UNSUPPORTED_FEATURE projection matches PlatformRouter
  • No singleflight in v1 — concurrent cold requests each build (locked by test_concurrent_cold_requests_each_build_v1_contract)
  • platform_for_tenant() async sibling-API parity for admin/health endpoints
  • Extracted _select_proposal_method module-level so both routers share the same routing logic without drift
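
The lazy-build and cache behaviour described in the bullets above can be sketched roughly as follows. This is an illustrative approximation, not the code merged in this PR: the class name `LazyCacheSketch`, the `get()` method, and the injectable `clock` parameter are assumptions made for the example. Only `cache_size`, `cache_ttl_seconds`, and the `inspect.isawaitable` convention come from the description.

```python
import asyncio
import inspect
import time
from collections import OrderedDict


class LazyCacheSketch:
    """Hypothetical sketch: build per-tenant objects on first use, LRU + TTL bounded."""

    def __init__(self, factory, cache_size=256, cache_ttl_seconds=3600.0,
                 clock=time.monotonic):
        if cache_size <= 0:
            raise ValueError("cache_size must be > 0")
        if cache_ttl_seconds < 0:
            raise ValueError("cache_ttl_seconds must be >= 0")
        self._factory = factory          # sync or async callable taking tenant_id
        self._size = cache_size
        self._ttl = cache_ttl_seconds    # 0 means size-only eviction
        self._clock = clock              # injectable so tests can fake time
        self._cache = OrderedDict()      # tenant_id -> (platform, built_at)

    async def get(self, tenant_id):
        entry = self._cache.get(tenant_id)
        if entry is not None:
            platform, built_at = entry
            fresh = self._ttl == 0 or (self._clock() - built_at) < self._ttl
            if fresh:
                self._cache.move_to_end(tenant_id)  # mark as most recently used
                return platform
            del self._cache[tenant_id]              # expired: rebuild below

        result = self._factory(tenant_id)
        if inspect.isawaitable(result):             # async factory: await it
            result = await result

        self._cache[tenant_id] = (result, self._clock())
        self._cache.move_to_end(tenant_id)
        while len(self._cache) > self._size:        # evict least recently used
            self._cache.popitem(last=False)
        return result
```

With a sync factory such as `lambda tenant_id: object()`, `asyncio.run(cache.get("tenant-a"))` builds once and later calls return the cached object until it expires or is evicted. Like the v1 contract above, this sketch has no singleflight, so two concurrent cold calls for the same tenant would each invoke the factory.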

Test plan

  • 26 new tests in tests/test_lazy_platform_router.py cover: drop-in compat, lazy build, async/sync factory + child, LRU eviction, TTL expiry (with monkeypatched clock), cache_ttl_seconds=0 size-only mode, construction validation, factory rejection paths (None / wrong type / raises), invalidate semantics, invalidate-during-build race contract (specific tenant + global), thundering herd v1 contract (concurrent cold = both build), proposal_managers routing (manager vs fall-through), platform_for_tenant introspection, unknown-tenant + unsupported-method projection
  • Adjacent suites green: test_platform_router.py, test_proposal_manager.py, test_subdomain_tenant_router.py
  • Full unit suite green (3903 passed, 0 failures)
  • ruff check + mypy clean on changed files

Reviewed by

Pre-PR review by code-reviewer, ad-tech-protocol-expert, dx-expert. Material concerns addressed:

  • Cache write-after-await race → per-tenant + global generation counters; in-flight build whose tenant was invalidated does NOT resurrect the cache slot
  • _select_proposal_method duplication between PlatformRouter (method) and LazyPlatformRouter (module-level) → consolidated to module-level only
  • DX gap: platform_for_tenant() parity with eager router → added
  • DX nit: factory union typing footgun for coding agents → docstring with sync + async examples on the type alias
  • Test gaps: thundering herd contract, proposal_managers routing, recovery == "terminal" assertion → added
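
The write-after-await race and its generation-counter fix can be illustrated with a minimal sketch. The names `GenerationGuardSketch`, `get()`, `_global_gen`, and `_tenant_gen` are hypothetical, and the merged implementation may structure this differently. The idea shown is only the one stated above: snapshot the counters before awaiting the factory, and skip the cache write if either counter moved in the meantime.

```python
import asyncio


class GenerationGuardSketch:
    """Hypothetical sketch: an invalidated in-flight build must not repopulate the cache."""

    def __init__(self, factory):
        self._factory = factory   # async callable taking tenant_id (assumed for brevity)
        self._cache = {}          # tenant_id -> platform
        self._global_gen = 0      # bumped by invalidate() with no tenant_id
        self._tenant_gen = {}     # tenant_id -> generation, bumped per tenant

    def invalidate(self, tenant_id=None):
        if tenant_id is None:
            self._global_gen += 1
            self._cache.clear()
        else:
            self._tenant_gen[tenant_id] = self._tenant_gen.get(tenant_id, 0) + 1
            self._cache.pop(tenant_id, None)

    def _generation(self, tenant_id):
        return (self._global_gen, self._tenant_gen.get(tenant_id, 0))

    async def get(self, tenant_id):
        if tenant_id in self._cache:
            return self._cache[tenant_id]

        snapshot = self._generation(tenant_id)      # taken before the await
        platform = await self._factory(tenant_id)   # invalidate() may run here

        # Only cache if nobody invalidated this tenant (or everything) mid-build.
        if self._generation(tenant_id) == snapshot:
            self._cache[tenant_id] = platform
        return platform                             # caller still gets its build
```

In this sketch the in-flight caller still receives the platform it built; only the cache write is skipped, so the next request for that tenant triggers a fresh build.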

🤖 Generated with Claude Code

…struction

Drop-in alternative to PlatformRouter that defers per-tenant
DecisioningPlatform construction to first request, with a bounded LRU
+ TTL cache. Closes the eager-router pain point for adopters with
N tenants × per-tenant SDK auth handshake (Google Ad Manager
service-account, Kevel API key) where boot scales O(N).

* Async or sync factory; awaited via inspect.isawaitable, matching
  CallableSubdomainTenantRouter's convention (PR #544).
* Bounded cache: cache_size > 0 mandatory (default 256);
  cache_ttl_seconds >= 0 (default 3600.0; 0 = size-only eviction).
  Distinct from CallableSubdomainTenantRouter which rejects ttl=0 —
  there tenants go stale, here platform adapters don't (unless the
  factory reads mutable config — docstring calls that out).
* invalidate(tenant_id=None) for hot-reload. Per-tenant + global
  generation counter snapshots prevent the invalidate-during-build
  race from resurrecting an evicted slot.
* Drop-in: isinstance(router, DecisioningPlatform) is true,
  serve() accepts it identically, ACCOUNT_NOT_FOUND /
  UNSUPPORTED_FEATURE projection matches PlatformRouter.
* No singleflight in v1 — concurrent cold requests each build (locked
  by test_concurrent_cold_requests_each_build_v1_contract).
* proposal_managers stays an eager dict (cheap to hold).
* platform_for_tenant() async sibling-API parity for
  admin/health endpoints.
* Extracted _select_proposal_method module-level so PlatformRouter
  and LazyPlatformRouter share the same routing logic without drift.

Closes #547.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit 98fd456 into main May 4, 2026
15 checks passed

Development

Successfully merging this pull request may close these issues.

feat: LazyPlatformRouter for tenant-on-first-request platform construction