feat: API key schema isolation — database-level tenant separation#855
salvormallow wants to merge 1 commit into vectorize-io:main
Adds ApiKeySchemaTenantExtension: maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants.
Threat model: prompt injection against AI agents. Agents execute tool calls based on conversation content. A prompt injection can trick an agent into querying another tenant's banks. Schema isolation scopes all SQL to the authenticated schema; banks from other schemas don't exist.
Configuration:
HINDSIGHT_API_TENANT_EXTENSION=...bank_scoped_tenant:ApiKeySchemaTenantExtension
HINDSIGHT_API_TENANT_KEY_MAP=key_a:schema_a;key_b:schema_b
Follows the SupabaseTenantExtension pattern. Opt-in, zero breaking changes. Includes 20 tests.
Dashboard caveat
When
Root cause:
Workaround: Set
Longer-term: The dashboard should support multi-tenant awareness: a tenant selector that switches which API key is used for dataplane calls. I'm working on a follow-up PR for this.
CI Status
Fixed:
Expected fork failures: The remaining ~15 failing jobs ( These tests would need a trusted CI re-run from a maintainer to pass.
Summary
Adds ApiKeySchemaTenantExtension, a built-in tenant extension that maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants. Follows the same pattern as SupabaseTenantExtension but uses static API key mapping instead of JWT auth.
Threat model: prompt injection against AI agents
AI agents execute tool calls, including Hindsight recall, retain, and reflect, based on conversation content. A prompt injection delivered via chat message, email, or web search result can trick an agent into querying another tenant's memory banks.
Example attack:
hindsight recall --bank tenant-b-bank --query "private data"
Why application-layer bank filtering isn't enough:
RequestContext.allowed_bank_ids exists on the model but is not enforced by the engine. An OperationValidatorExtension could check it, but:
- When allowed_bank_ids is None (the default), all access is granted
- allowed_bank_ids is never set for background tasks
Why schema isolation works:
The API key determines the PostgreSQL schema at authentication time, before any bank lookup or query executes. The SQL itself is scoped via fully-qualified table names. Even a fully compromised agent can only access banks within its assigned schema. Banks from other schemas don't exist in its view of the database.
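As a rough illustration of the scoping described above, the sketch below builds a schema-qualified query. The helper names (qualified_table, bank_lookup_sql), the banks table, and the bank_id column are hypothetical and chosen only for illustration; they are not taken from Hindsight's code.

```python
import re

# Hypothetical helpers for illustration; not Hindsight's actual API.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def qualified_table(schema: str, table: str) -> str:
    """Return a double-quoted, schema-qualified table name."""
    for ident in (schema, table):
        # Identifiers cannot be bound as query parameters, so they are
        # validated before being interpolated into SQL text.
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"invalid identifier: {ident!r}")
    return f'"{schema}"."{table}"'


def bank_lookup_sql(schema: str) -> str:
    """Build a bank lookup that can only see rows in `schema`."""
    # The bank id is passed as a bound parameter ($1), never interpolated.
    return f"SELECT * FROM {qualified_table(schema, 'banks')} WHERE bank_id = $1"
```

With a query shaped like this, a request authenticated into schema_a that asks for tenant-b-bank simply finds no matching row, because schema_b's table is never named in the SQL.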
How it works
SupabaseTenantExtension)
Configuration
Design decisions
Opt-in, zero breaking changes. If HINDSIGHT_API_TENANT_EXTENSION is not set, Hindsight uses DefaultTenantExtension, which is identical to current behavior. Existing deployments are unaffected.
One key = one schema. Each API key maps to exactly one PostgreSQL schema. A single key cannot access multiple schemas. This is intentional: one key = one blast radius. The TenantContext returns a single schema_name, and the engine scopes all queries to it. Cross-schema queries are not possible without direct Postgres access.
Admin access. There is no "superuser key" that spans all schemas. Operators who need cross-tenant visibility should query Postgres directly or use separate keys per schema. This is a conscious trade-off: admin convenience vs. the guarantee that no single compromised key grants access to all tenants.
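The "one key = one schema" decision can be sketched as a plain lookup that either resolves to exactly one schema or fails. This is a minimal sketch under stated assumptions: the TenantContext below is a stand-in that models only schema_name, and authenticate and AuthenticationError are hypothetical names rather than the extension's real interface.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    # Stand-in for Hindsight's TenantContext; only schema_name is modeled.
    schema_name: str


class AuthenticationError(Exception):
    """Raised when an API key is not present in the key map (hypothetical)."""


def authenticate(api_key: str, key_map: dict[str, str]) -> TenantContext:
    """Resolve an API key to the single schema it is mapped to."""
    for known_key, schema in key_map.items():
        # Constant-time comparison avoids leaking key contents via timing.
        if hmac.compare_digest(api_key.encode(), known_key.encode()):
            return TenantContext(schema_name=schema)
    # No wildcard or superuser fallback: unknown keys are rejected outright.
    raise AuthenticationError("unknown API key")
```

Because the function returns a single TenantContext and has no fallback branch, there is no input that yields access to more than one schema.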
MCP auth disabled = default schema only. When mcp_auth_disabled=true, MCP requests fall back to the default schema (from HINDSIGHT_API_DATABASE_SCHEMA), not a tenant schema.
Schema name validation. Schema names must be valid Postgres identifiers (letters, digits, underscores). Hyphens, spaces, and names starting with digits are rejected at startup.
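The startup validation described above might look roughly like the following. The key_a:schema_a;key_b:schema_b format comes from the HINDSIGHT_API_TENANT_KEY_MAP example in this PR; the function name parse_key_map, the error messages, the duplicate-key check, and the assumption that API keys contain no colon are all illustrative guesses, not the extension's actual behavior.

```python
import re

# Letters, digits, underscores; must not start with a digit.
_SCHEMA_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_key_map(raw: str) -> dict[str, str]:
    """Parse 'key_a:schema_a;key_b:schema_b' into {api_key: schema_name}."""
    key_map: dict[str, str] = {}
    for entry in filter(None, (part.strip() for part in raw.split(";"))):
        # Assumption: keys contain no ':', so the first colon is the separator.
        key, sep, schema = entry.partition(":")
        if not sep or not key or not schema:
            raise ValueError("malformed entry in tenant key map")
        if not _SCHEMA_NAME.fullmatch(schema):
            raise ValueError(f"invalid schema name: {schema!r}")
        if key in key_map:
            raise ValueError("duplicate API key in tenant key map")
        key_map[key] = schema
    return key_map
```

Failing fast at startup means a typo such as tenant-a surfaces as a configuration error rather than as a malformed SQL identifier at request time.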
Why not allowed_bank_ids + OperationValidatorExtension? See threat model above. Application-layer checks are defense-in-depth, not a security boundary. Schema isolation moves the enforcement into the database where it can't be bypassed by missed code paths.
Files changed
hindsight-api-slim/.../builtin/bank_scoped_tenant.py: ApiKeySchemaTenantExtension (~170 lines)
hindsight-api-slim/tests/test_bank_scoped.py
Test plan
- run_migration works on first authenticated request
- public schema reads existing data correctly