feat(rbac): foundation — catalog, schema, sqlc, seed, backfill (PR 1 of 4)#283
feat(rbac): foundation — catalog, schema, sqlc, seed, backfill (PR 1 of 4)#283mcharles-square wants to merge 18 commits into
Conversation
Adds the technical design for moving Proto Fleet from the current SUPER_ADMIN/ADMIN/RequireAdmin gate to a granular permission model with scoped role assignments, editable built-in roles, and per-org custom roles. Key decisions baked in: - Three seeded roles: SUPER_ADMIN (immutable), ADMIN and FIELD_TECH (editable by SUPER_ADMIN, not deletable) - Permission catalog plus read-pairing rule (action perms require matching read perms; enforced at role save) - Site-scope narrowing semantics (site-scoped assignment overrides org-scoped grant at that site; org grant still applies elsewhere) - Filter-don't-reject list semantics for site-scoped readers - Route-guard parity: every primary-nav entry has a permission predicate; nav hides AND route redirects when false - Additive-only startup reconciliation for ADMIN/FIELD_TECH so operator edits survive upgrades; SUPER_ADMIN fully reconciled - Two error codes: BUILTIN_ROLE_IMMUTABLE (SUPER_ADMIN only) and BUILTIN_ROLE_NON_DELETABLE (ADMIN/FIELD_TECH delete attempts)
Add the canonical permission catalog as a Go code constant in server/internal/domain/authz/catalog.go and the matching Permission wire shape in proto/authz/v1/authz.proto. The catalog is the single source of truth for U2's seed migration, U4's startup reconciliation, U6's resolver lookup, U8's role-save validation, and U10's client-side gating. 29 permissions across 10 resource groups (fleet, miner, rack, site, serverlog, curtailment, fleetnode, apikey, user, role). No apikey:read — the API Keys page lives under route-guarded Settings, so a viewer-only split has no reachable UI. Tests cover catalog completeness (every Perm* constant is in the catalog), key shape (^[a-z]+:[a-z_]+$), fresh-copy semantics, and resource grouping.
Migration 000051 lays the schema foundation for the RBAC v2 work: - ALTER role ADD is_builtin, builtin_key (stable identifier code uses) - CREATE permission (catalog) — reconciled at startup from domain/authz/catalog.go, U4 will land the reconciliation - CREATE role_permission (join, ON DELETE RESTRICT on permission_id so obsolete catalog rows cannot silently drop) - CREATE user_organization_role (multi-assignment with org/site scope, composite FK to site(id, org_id) for DB-enforced tenant isolation, DEFERRABLE so transactional reassignments work) Existing user_organization.role_id is preserved unchanged; U5 (migration 000053) will backfill assignments and neutralize the legacy column with a raising trigger. Drop happens in U12 after a soak window.
…s (U3) Adds sqlc queries used by the resolver (U6), seed reconciliation (U4), and role management RPCs (U8): - permission.sql — list/get/upsert catalog rows, role-permission insert/delete, and the SUPER_ADMIN PrunePermissionsOutsideKeys used by full reconciliation - user_organization_role.sql — AssignRole/UnassignRole, list helpers, ListEffectivePermissionsForUser (the single-query resolver source), and the last-org-scope-SUPER_ADMIN guards used by U8 - role.sql extensions — ListBuiltinRoles, UpsertBuiltinRole, CreateCustomRole, UpdateCustomRoleName, SoftDeleteCustomRole. Custom-mutation queries filter `builtin_key IS DISTINCT FROM 'SUPER_ADMIN'`; SoftDeleteCustomRole is locked to is_builtin=FALSE (delete locked for every built-in) `just gen` produces clean bindings; server still builds.
Migration 000052 seeds the 29-permission catalog and the three
built-in role rows (SUPER_ADMIN, ADMIN, FIELD_TECH) so a fresh
install has usable state before the server's Go-level reconciler
runs.
server/internal/domain/authz/builtin.go declares the in-code seed
formulas — SUPER_ADMIN gets AllPermissions() and is the only role
reconciled fully; ADMIN gets AllPermissions() minus {user:*,
role:manage} via a code formula; FIELD_TECH gets an explicit set
(fleet:read, miner:read, miner:blink_led, miner:download_logs,
rack:read, rack:manage). ADMIN and FIELD_TECH are reconciled
additive-only so operator edits via UpdateCustomRole (U8) persist.
reconcile.go runs inside a single transaction holding
pg_advisory_xact_lock keyed on hashtextextended('authz:builtin_reconcile')
so rolling deploys and autoscaler events serialize cleanly.
Convergence uses idempotent inserts plus a targeted prune for
SUPER_ADMIN only — no delete-then-insert window where SUPER_ADMIN
has zero rows.
main.go invokes authz.Reconcile after db.ConnectAndMigrate and before
any service is constructed, so no request can be served before
built-in roles are reconciled.
Tests cover: fresh install (all three roles, correct permission
counts), idempotency, operator-edit survival on ADMIN and
FIELD_TECH, SUPER_ADMIN tamper repair, obsolete-permission pruning
on SUPER_ADMIN, concurrent reconciles. DB-dependent tests need
Postgres running locally (just dev) — non-DB catalog tests pass
green.
…ion_role (U5) Migration 000053 mirrors every active user_organization row into the new user_organization_role table as an org-scope assignment, so users keep their existing access once the resolver in PR 2 starts evaluating assignments instead of the legacy role_id column. Soft-deleted user_organization rows are skipped. The down migration soft-deletes only the rows the up migration created — assignments granted via the new RPCs post-backfill survive a rollback. Plan deviation called out explicitly in the migration comment: the plan's U5 also renamed user_organization.role_id to role_id_deprecated_do_not_use and added a raising trigger on non-NULL writes. That part is moved to PR 2's opening migration so it lands together with the caller swap in U6/U7. Doing the rename here would break the existing onboarding flow (CreateUserOrganization, UpdateUserRole) the moment PR 1 deploys, which would leave PR 1 unshippable without PR 2 already in flight. The plan's intent — loud failure during the soak window — still holds, just inside PR 2's boundary. Tests cover the backfill INSERT against a real Postgres (testutil.GetTestDB), exercise the ON CONFLICT path with a re-run, prove soft-deleted rows are not copied, and confirm idempotency. DB-dependent; needs Docker + just dev to execute locally.
…int) The intrange linter flagged the concurrent-reconcile test's classic three-clause for. Cosmetic only.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f33566ac9e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
🔐 Codex Security Review
Review SummaryOverall Risk: HIGH Findings[HIGH] Non-super-admin sessions can create a new ADMIN user
[MEDIUM] Last-SUPER_ADMIN guards count deactivated users
NotesNo cryptostealing/pool hijack changes, command execution changes, plugin boundary changes, discovery changes, or infrastructure changes were present in the reviewed diff. Generated by Codex Security Review | |
There was a problem hiding this comment.
Pull request overview
Lays the foundation for the upcoming granular RBAC rollout by introducing a permission catalog, the supporting database schema + sqlc bindings, seed/backfill migrations, and a startup reconciliation step to keep built-in roles and the catalog converged over time (while keeping RequireAdmin as the active authorization path for now).
Changes:
- Added RBAC v2 schema:
permission,role_permission, anduser_organization_role(scoped assignments) plus built-in awareness onrole. - Seeded the initial permission catalog + built-in roles via migration and added a boot-time reconciler to keep catalog/roles in sync.
- Backfilled org-scope assignments from legacy
user_organizationto preserve existing access.
Reviewed changes
Copilot reviewed 18 out of 26 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| server/sqlc/queries/user_organization_role.sql | New sqlc queries for scoped role assignments and effective-permission joins. |
| server/sqlc/queries/role.sql | Adds built-in role lookups and custom-role CRUD queries. |
| server/sqlc/queries/permission.sql | Adds catalog upsert/lookups and role-permission mutation helpers. |
| server/migrations/000051_create_permission_tables.up.sql | Creates permission/role-permission/assignment tables; adds built-in columns to role. |
| server/migrations/000051_create_permission_tables.down.sql | Drops RBAC v2 tables and removes built-in columns from role. |
| server/migrations/000052_seed_builtin_roles.up.sql | Seeds permission rows and built-in roles (SUPER_ADMIN/ADMIN/FIELD_TECH) + role_permission seeds. |
| server/migrations/000052_seed_builtin_roles.down.sql | Attempts to undo the seed state for roles/permissions. |
| server/migrations/000053_backfill_user_organization_to_assignments.up.sql | Backfills user_organization_role from legacy user_organization. |
| server/migrations/000053_backfill_user_organization_to_assignments.down.sql | Soft-deletes backfilled assignment rows on rollback. |
| server/internal/domain/authz/reconcile.go | New boot-time reconciler for catalog + built-in roles/permissions. |
| server/internal/domain/authz/reconcile_test.go | DB-dependent tests for reconcile correctness, idempotency, and concurrency. |
| server/internal/domain/authz/catalog.go | Introduces the in-code permission catalog and related helper APIs. |
| server/internal/domain/authz/catalog_test.go | Unit tests enforcing catalog key shape, uniqueness, and completeness. |
| server/internal/domain/authz/builtin.go | Declares built-in role specs and seed permission formulas. |
| server/internal/domain/authz/backfill_test.go | DB-dependent tests validating backfill behavior and idempotency. |
| server/cmd/fleetd/main.go | Calls authz.Reconcile(...) after migrations and before serving. |
| proto/authz/v1/authz.proto | New proto messages for permissions and permission groups. |
| server/generated/grpc/authz/v1/authz.pb.go | Generated Go types for the new authz.v1 protos. |
| client/src/protoFleet/api/generated/authz/v1/authz_pb.ts | Generated TS types for the new authz.v1 protos. |
| server/generated/sqlc/user_organization.sql.go | Regenerated sqlc output reflecting new role columns. |
| server/generated/sqlc/user_organization_role.sql.go | Generated sqlc bindings for user_organization_role queries. |
| server/generated/sqlc/role.sql.go | Generated sqlc bindings for new role queries/mutations. |
| server/generated/sqlc/permission.sql.go | Generated sqlc bindings for permission catalog and role-permission operations. |
| server/generated/sqlc/models.go | Generated sqlc models for Permission/RolePermission/UserOrganizationRole + updated Role model. |
| server/generated/sqlc/db.go | Generated sqlc statement preparation/close wiring for new queries. |
| docs/plans/2026-05-19-001-feat-granular-rbac-plan.md | Adds/updates the accepted granular RBAC plan doc (referenced by this PR). |
… assignment uniqueness, downstream cleanups Addresses the codex/copilot review threads on PR #283 plus the Codex Security HIGH finding on cross-tenant role bleed. **Schema (000051):** - Replace uq_user_org_role_scope (broken on org-scope NULL scope_id) with two partial unique indexes scoped to live rows: uq_user_org_role_org_scope WHERE scope_type = 'org' AND deleted_at IS NULL uq_user_org_role_site_scope WHERE scope_type = 'site' AND deleted_at IS NULL This closes the NULL-distinct loophole AND unblocks re-assignment after UnassignRole (soft-delete no longer collides). - Per-org role model: add role.organization_id with FK to organization(id) ON DELETE CASCADE. Drop the global uq_role_name from 000002 and replace with two partial unique indexes — built-ins unique per (org, builtin_key), customs unique per (org, name). Editing one org's ADMIN cannot leak into another org's ADMIN. - Add CHECK chk_role_builtin_key_matches_flag enforcing the is_builtin ↔ builtin_key invariant at the DB layer. **Seed (000052):** - Per-org seed: cross-join active orgs with the built-in spec list so every org gets its own SUPER_ADMIN, ADMIN, FIELD_TECH rows. - Repoint existing user_organization.role_id from the legacy global ADMIN/SUPER_ADMIN rows to the per-org replacements (match by name). - Soft-delete the legacy global rows. - Down migration deletes role_permission unconditionally before permission rows (thread 3 — RESTRICT FK fired the other way). **Backfill (000053):** ON CONFLICT clause updated to target the partial index (user_id, organization_id, role_id) WHERE scope_type = 'org' AND deleted_at IS NULL. **Reconciler:** Rewritten to list active orgs and call seedOrgBuiltins for each. Public SeedOrgBuiltins helper exposed so the onboarding flow can seed a brand-new org's built-ins inside its existing creation transaction. Per-org policy matches the plan — SUPER_ADMIN full reconcile, ADMIN/FIELD_TECH additive-only. **Onboarding wiring:** CreateAdminUserWithOrganization stops calling the global UpsertRole; it now calls authz.SeedOrgBuiltins and uses the returned per-org SUPER_ADMIN id for the user's user_organization row. roleName/roleDescription params kept on the interface for backward compatibility but ignored (built-in definitions come from authz/builtin.go). **sqlc queries:** UpsertBuiltinRoleForOrg, GetBuiltinRoleForOrg, ListBuiltinRolesForOrg, ListActiveOrganizationIDs, CreateCustomRole (takes org_id), ListCustomRolesForOrg (filters by org_id), UpdateCustomRoleName and SoftDeleteCustomRole gated on is_builtin = FALSE. **Catalog (thread 5):** CatalogByResource comment corrected — map iteration is not stable. Added ResourceOrder() for callers that need declaration-driven order. **Tests:** - New: per-org-isolation test proves editing org A's ADMIN does not affect org B's ADMIN (the whole point of Option C). - New: TestAssignment_SoftDeletedRowDoesNotBlockReassign and TestAssignment_DuplicateLiveOrgScopeRejected lock in the fixed unique-index behavior. - All existing reconcile/backfill tests rewritten to operate per-org (helpers take an orgID; tests insert an org before calling Reconcile). Resolves codex/copilot review threads on PR #283.
Codex Security Review — responseThanks for the careful read. Both findings are addressed in commit HIGH — Custom roles global across organizations
This was a real cross-tenant bleed risk and went deeper than just custom roles — the built-in Switched the design to per-org built-ins (every role is owned by an organization):
New regression test MEDIUM — Assignment uniqueness does not work for org-scope rows and conflicts with soft delete
Fixed both halves by replacing
New tests:
The backfill |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0e47b16251
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Per project feedback: code comments should document the why directly, not point at plan/PR labels that can rot. Stripped "U4", "U6", "U8", "U11", "PR 2", and similar references from catalog.go, reconcile.go, the sqlc query files, the migrations, and the onboarding hook in user.go; rewrote each affected comment so it stands on its own. Behavior unchanged.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef9ba2cee8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
… per-org admin lookup + dual-write Addresses round-2 PR review: **Reconciler (Codex Security HIGH #1):** previous implementation re-asserted every seed permission on every boot for ADMIN and FIELD_TECH, which silently restored operator-revoked rows the next time fleetd started. The intent of "additive-only" was operator-edits- survive — actual behavior was "seed-set-is-restored." Fixed by splitting reconciliation: GetBuiltinRoleForOrg first; if the row does not exist, INSERT and seed every permission in the spec; if it does exist, SUPER_ADMIN gets full reconcile (assign + prune), ADMIN and FIELD_TECH get nothing. Once a per-org built-in row exists the operator owns it; new catalog permissions only show up on roles the operator opts into editing. Two new regression tests: TestReconcile_OperatorRevokedSensitivePermStaysRevokedOnRestart and TestReconcile_NewCatalogPermissionDoesNotPropagateToExistingBuiltins. **Composite role FK (Codex Security HIGH #2):** added UNIQUE(id, organization_id) on role and replaced user_organization_role's role_id FK with a composite FK on (role_id, organization_id) → role(id, organization_id). An assignment can no longer reference a role owned by a different organization — DB-enforced tenant isolation. ListEffectivePermissionsForUser, CountOrgScopeSuperAdminsExcludingAssignment, and CountOrgScopeSuperAdminsExcludingUser now also assert r.organization_id = uor.organization_id in their joins as defense in depth. **Per-org ADMIN lookup (Codex thread P1):** auth.Service.CreateUser was calling GetRoleByName(AdminRoleName) which now returns an arbitrary per-org ADMIN row across orgs. Added UserManagementStore.GetBuiltinRoleForOrg(orgID, builtinKey) and the SQL implementation; CreateUser uses it with the user's actual orgID so the new user is bound to their own org's ADMIN row. **Dual-write into user_organization_role (Codex Security MEDIUM):** new users created post-migration only landed in legacy user_organization. The resolver, once it goes live, reads from user_organization_role, which means fresh installs would have a founding SUPER_ADMIN with no effective permissions. Both CreateAdminUserWithOrganization and CreateUserOrganizationRole now dual-write: legacy user_organization plus an org-scope row in user_organization_role. New test TestOnboarding_FoundingUserGetsOrgScopeSuperAdminAssignment locks this in.
Codex Security Review (round 2) — responseAll three findings addressed in HIGH — Editable built-in roles regain revoked permissions on restart
Spot-on. The original wording "additive-only" was misleading — the implementation re-asserted the seed set on every boot, which silently restored anything an operator deliberately revoked. Fixed by switching to seed-on-create: Two regression tests added:
Trade-off: new catalog keys added in future releases no longer auto-propagate to existing orgs' ADMIN/FIELD_TECH. Operators add them via the role editor. That's the correct ownership boundary — "the operator owns this role" cannot also mean "the deploy can quietly widen it." HIGH — Role assignments can reference roles from another organization
Fixed structurally:
MEDIUM — New users are not mirrored into the new assignment table
Both New test
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dcf5224bf1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Migration 000051 drops uq_role_name, so the old UpsertRole's ON CONFLICT (name) was broken at runtime — any caller would fail with "no unique or exclusion constraint matching the ON CONFLICT specification." Codex P2 caught it. Replaced UpsertRole with UpsertCustomRoleForOrg(name, description, organization_id). ON CONFLICT now targets the partial unique index uq_role_org_custom_name WHERE is_builtin = FALSE AND deleted_at IS NULL — the correct conflict surface for per-org custom roles. Built-ins still go through UpsertBuiltinRoleForOrg. Updated callers: - server/internal/testutil/database_queries.go: CreateSuperAdminUser and CreateSuperAdminUser2 now call authz.SeedOrgBuiltins to get the per-org SUPER_ADMIN row (mirroring the real onboarding flow), and dual-write into user_organization_role so test users land in both tables. - server/internal/handlers/apikey/handler_test.go: the VIEWER custom role created for the non-admin auth test uses UpsertCustomRoleForOrg with the test's existing adminUser.OrganizationID; the test user is also dual-written into user_organization_role. Build + lint clean.
Codex caught: SoftDeleteOrganization only marks organization.deleted_at — it doesn't cascade to user_organization. A soft-deleted org can still have active user_organization rows. If 000052 filtered those orgs out of the per-org built-in seed: 1. Step 3 (repoint user_organization.role_id to per-org rows) finds no per-org built-in for the soft-deleted org and leaves the row pointing at the legacy global ADMIN/SUPER_ADMIN. 2. 000053 backfills that row into user_organization_role with role_id = legacy global (organization_id NULL) and organization_id = the soft-deleted org's id. 3. The composite FK fk_user_org_role_role (role_id, organization_id) → role(id, organization_id) rejects the row because the legacy global has organization_id NULL. Deploy fails on any environment with soft-deleted org memberships. Fixed by dropping the org.deleted_at IS NULL filter in the per-org seed; soft-deleted orgs now get their three built-in rows, the repoint succeeds, and the backfill produces FK-valid rows. Three rows per soft-deleted org is bounded and lets undelete "just work." ListActiveOrganizationIDs (used by the boot reconciler) still excludes deleted orgs, so we don't reconcile their permissions on every boot. New test TestSeed_SoftDeletedOrgStillGetsBuiltins inserts a soft-deleted org and asserts SeedOrgBuiltins produces all three per-org rows.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 54336d8a35
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ering Round-3 PR review fixes: **000053 down — hard-delete mirrored assignments (Codex Security HIGH):** Soft-deleted user_organization_role rows still hold the composite FK (ON DELETE RESTRICT) to per-org built-in roles, so the next step in the rollback chain — 000052 down's DELETE on those roles — failed mid-rollback. Switched to hard delete. Removed the redundant deleted_at filter so the down also clears rows produced by dual-write paths during the soak window. **000051 up — reserve built-in names from custom rows (Codex Security MEDIUM):** Existing authorization gates still compare session.Info.Role by string. A custom row named 'ADMIN' or 'SUPER_ADMIN' could be mistaken for the built-in by name-based code paths. Added CHECK chk_role_custom_name_not_reserved rejecting customs with names 'SUPER_ADMIN', 'ADMIN', or 'FIELD_TECH'. **000051 down — handle duplicate custom names (Codex P1 thread):** The partial unique index uq_role_org_custom_name allows two orgs to have customs with the same name. The down's restoration of UNIQUE(name) would then fail with a duplicate-key error and abort mid-rollback. Hard-delete every custom row before restoring the global constraint. PR 1 ships no RPC to create customs so this is a no-op in production today; it makes the down safe forward. **backfill_test.go — insert org before Reconcile (Codex Security MEDIUM):** Tests called authz.Reconcile then insertTestOrganization, but Reconcile only iterates orgs that exist at the time it runs. Reordered so the org is in place before the boot reconcile runs and the per-org built-ins are created. All five affected tests fixed.
Codex Security Review (round 3) — responseAll three findings addressed in HIGH — RBAC rollback is blocked by retained assignment FK references
Fixed: 000053 down now hard-deletes the mirrored assignment rows (using MEDIUM — Custom roles can use built-in role names while authorization still trusts role names
Fixed at the DB layer in 000051 up: ALTER TABLE role
ADD CONSTRAINT chk_role_custom_name_not_reserved CHECK (
is_builtin = TRUE
OR name NOT IN ('SUPER_ADMIN', 'ADMIN', 'FIELD_TECH')
);Application code that creates custom roles will surface this as a clear constraint violation. The structural block holds even if a future migration or a direct DB insert tries to create a colliding name. MEDIUM — New backfill tests create orgs after reconciliation
Fixed: all five affected tests in |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6105c60b75
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…tolerant
The round-3 chk_role_custom_name_not_reserved CHECK and the
uq_role_org_custom_name partial unique index were both case-sensitive.
That left two real gaps:
1. A custom role named "admin" / "Admin" / " ADMIN " passed the CHECK
and didn't collide with the built-in "ADMIN" in the unique index.
The existing name-based gates (`info.Role != "ADMIN"`) wouldn't be
fooled — they're also case-sensitive uppercase — but the role list
would show near-duplicates that read identically, and any future
code change to case-insensitive comparison would reopen the gate
bypass.
2. Within a single org, an operator could create "admin" and "Admin"
as two distinct custom roles. Confusing UX and a soft attack
surface.
Both closed:
- CHECK rewritten as `LOWER(BTRIM(name)) NOT IN
('super_admin', 'admin', 'field_tech')` so case variants and
whitespace-padded forms are rejected at insert time.
- Partial unique index `uq_role_org_custom_name` keyed on
`(organization_id, LOWER(BTRIM(name)))` so case variants collapse
to one row per org. Display capitalization is preserved — the
`name` column stores the original; only the index key is
case-folded.
- `UpsertCustomRoleForOrg`'s ON CONFLICT target updated to match the
new index expression so upserts hit the existing row instead of
failing.
Three new regression tests:
- TestCustomRole_ReservedBuiltinNamesRejectedCaseInsensitively walks
every reasonable case/whitespace variant of the three built-in
names and asserts each is rejected.
- TestCustomRole_CaseInsensitiveUniquenessPerOrg upserts "Floor
Manager" then "FLOOR MANAGER" against the same org and asserts
both calls return the same role id (and the original "Floor
Manager" casing is preserved).
- TestCustomRole_SameNameAcrossOrgsAllowed proves per-org isolation:
two different orgs can each have a "Floor Manager" custom role.
Homoglyph attacks (Cyrillic А for Latin A, etc.) remain out of
scope — a unicode-normalization pass at the API boundary is the
right place for that if it ever becomes a real concern. Documented
in the migration comment.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 81e7ee38cc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…r 000051 down
Round-4 PR review caught two related bugs in my last commit:
**000051 up — CHECK rejected legacy rows on upgrade (Codex HIGH / P0 thread):**
Migration 000002 seeded a global ADMIN row, and onboarding may have
created a SUPER_ADMIN row, both with is_builtin=FALSE and
organization_id NULL. Adding chk_role_custom_name_not_reserved was
making Postgres validate against those rows immediately and aborting
the migration before 000052 could repoint and soft-delete them. New
deployments would also fail since 000002 always seeds ADMIN. Exempt
legacy rows (organization_id IS NULL) from the CHECK — the exemption
is safe because CreateCustomRole always sets organization_id, so an
application caller cannot exploit it.
**000051 down — order error + over-broad role delete (Codex MEDIUM / P1 thread):**
The previous version of the down file dropped role_permission at the
top, then later ran DELETE FROM role_permission — relation does not
exist. It also DELETE'd every is_builtin=FALSE role, which on a real
rollback would include the legacy ADMIN/SUPER_ADMIN rows that 000052's
own down just restored, leaving user_organization.role_id dangling.
Rewrote with explicit steps:
Step 1: custom cleanup (only org-scoped customs; legacy globals
preserved so they can satisfy 000052's restored
user_organization references).
Step 2: drop new tables and assignment indexes.
Step 3: drop role-table columns/constraints/indexes.
Step 4: restore uq_role_name UNIQUE(name).
Behavior unchanged for fresh installs (no customs exist); now safe
under rollback after PR 3 ships CreateCustomRole.
Codex Security Review (round 4) — responseBoth findings addressed in HIGH — RBAC migration rejects existing ADMIN/SUPER_ADMIN rows
Fixed: the CHECK now exempts MEDIUM — Down migration cannot roll back cleanly
Fixed: rewrote the down migration into four explicit steps — custom cleanup first (while the table and the is_builtin/organization_id columns still exist), then table drops, then role-column drops, then Build + lint clean. |
main shipped a new 000051_bound_curtailment_operational_controls migration. Bumping the RBAC migrations to avoid a number collision: 000051_create_permission_tables → 000052 000052_seed_builtin_roles → 000053 000053_backfill_user_organization... → 000054 Inline references in the migration files and in the authz Go package comments updated to match. No behavior change.
# Conflicts: # server/generated/sqlc/db.go
First PR of the four-part granular RBAC plan (#282 — the plan PR, updated alongside this branch as the design evolved).
This is the foundation cut: schema, permission catalog, sqlc bindings, seed migration with per-org built-in roles, startup reconciliation, the user_organization backfill, and the onboarding dual-write. The existing
RequireAdminmiddleware stays in place and continues to be the live authorization path. PR 2 will swap callers to a permission-based gate; until that lands, this PR is purely additive — the new tables are populated and the resolver inputs are ready, but no code path reads them yet.What landed (foundation surface)
Schema — migration 000052 (
server/migrations/000052_create_permission_tables.{up,down}.sql)permission,role_permission, anduser_organization_roletables.ALTER TABLE roleaddsis_builtin,builtin_key, andorganization_id— every role is now owned by an organization (built-in or custom). The legacy globaluq_role_name UNIQUE(name)from migration 000002 is dropped.role(live rows only):uq_role_org_builtin_key— per-org built-in identity, keyed on(organization_id, builtin_key).uq_role_org_custom_name— per-org custom uniqueness keyed on(organization_id, LOWER(BTRIM(name))). Case-insensitive and whitespace-trim-tolerant:"Admin","admin", and" ADMIN "collapse to one row.user_organization_role(NULL-safe, live-row scoped):uq_user_org_role_org_scopeforscope_type = 'org'(omitsscope_idto dodge Postgres's NULL-distinct trap).uq_user_org_role_site_scopeforscope_type = 'site'.deleted_at IS NULLso re-assignment afterUnassignRoleis allowed.user_organization_role(role_id, organization_id)→role(id, organization_id)(UNIQUE (id, organization_id)onroleis the target). An assignment row can no longer reference a role from a different org — DB-enforced tenant isolation.site(id, org_id)for site-scoped assignments,DEFERRABLE INITIALLY DEFERREDfor transactional reassignments.chk_role_builtin_key_matches_flag—is_builtin = TRUE ↔ builtin_key IS NOT NULL.chk_role_custom_name_not_reserved— custom rows cannot be namedSUPER_ADMIN,ADMIN, orFIELD_TECH(case-insensitive). Legacy global rows (organization_id IS NULL) are exempt so existing deployments don't fail the validation on upgrade.chk_user_org_role_scope_id_matches_type— org-scope ⇒scope_id IS NULL; site-scope ⇒scope_id IS NOT NULL.Permission catalog —
server/internal/domain/authz/catalog.go+proto/authz/v1/authz.proto29 permission keys across 10 resource groups (
fleet,miner,rack,site,serverlog,curtailment,fleetnode,apikey,user,role). Source of truth for everything downstream. Tests assert catalog completeness, key shape (^[a-z]+:[a-z_]+$), and resource grouping.sqlc queries (
server/sqlc/queries/)Org-aware throughout — every role-touching query takes
organization_id. New / changed:permission.sql— list/get/upsert catalog + role-permission insert/delete +PrunePermissionsOutsideKeys(used by SUPER_ADMIN full-reconcile only).role.sql—ListBuiltinRolesForOrg,GetBuiltinRoleForOrg,UpsertBuiltinRoleForOrg,ListActiveOrganizationIDs,ListCustomRolesForOrg,CreateCustomRole,UpsertCustomRoleForOrg,UpdateCustomRoleName,SoftDeleteCustomRole. The legacyUpsertRole(name, description)was retargeted toUpsertCustomRoleForOrg(name, description, organization_id)and itsON CONFLICTmatches the new partial index expression.user_organization_role.sql—AssignRole,UnassignRole,ListAssignmentsForUser,ListAssignmentsForRole,ListEffectivePermissionsForUser(single-query resolver source), and the last-org-scope-SUPER_ADMIN guards used by the future delete/deactivate paths.r.organization_id = uor.organization_idas belt-and-suspenders defense in depth.Seed + per-org reconciliation — migration 000053 +
server/internal/domain/authz/{builtin,reconcile}.goorganization × built-in spec(including soft-deleted orgs, since membership doesn't cascade). Existinguser_organizationrows that pointed at the legacy global ADMIN/SUPER_ADMIN rows are repointed to each user's per-org replacement; the legacy globals are then soft-deleted.Reconcile(ctx, db)runs at boot fromcmd/fleetd/main.goinside apg_advisory_xact_lock'd transaction. It iteratesListActiveOrganizationIDsand callsSeedOrgBuiltins(ctx, q, orgID)for each.SUPER_ADMIN— fully reconciled toAllPermissions()per org. Tampering is repaired on every boot.ADMIN/FIELD_TECH— seeded once on creation; left alone after that. When a per-org row is first inserted, every permission in the seed formula is attached. Subsequent boots leave itsrole_permissionset untouched. The operator owns the role after creation. Re-asserting the seed set every boot would have silently restored permissions the operator deliberately revoked (e.g.,miner:update_pools,miner:firmware_update) — exactly the bug Codex caught in round 1.SeedOrgBuiltinsis also called from the onboarding flow (see below) so a brand-new org gets its built-ins in the same transaction as org creation.Backfill — migration 000054 + onboarding dual-write
user_organizationrow intouser_organization_roleas an org-scope assignment, after 000053 has repointed legacyrole_idreferences to per-org rows. Soft-deleted user_organization rows are skipped.CreateAdminUserWithOrganization(founding user, insqlstores/user.go) andCreateUserOrganizationRole(subsequent users) now write bothuser_organizationANDuser_organization_roleinside the same transaction. Without this, fresh installs would have a founding SUPER_ADMIN with no row in the assignment table, and the resolver would deny everything once PR 2 starts consuming it.auth.Service.CreateUserswapped fromGetRoleByName("ADMIN")(which would return an arbitrary per-org ADMIN row across orgs) toGetBuiltinRoleForOrg(orgID, AdminRoleName).Plan deviation (still in force)
The plan's U5 also renamed
user_organization.role_idtorole_id_deprecated_do_not_useand added a raising trigger on non-NULL writes. That part is moved to PR 2's opening migration so it lands together with the caller swap. Doing it in PR 1 would break the existing onboarding flow the moment this deploys. The plan's intent — loud failure during the soak window — is preserved; the soak boundary just shifts from "between PR 1 and U12" to "between PR 2 and U12". Noted inline in000054_backfill_user_organization_to_assignments.up.sql.What this PR does NOT do
RequirePermissionmiddleware or resolver — PR 2.RequireAdmincallers — PR 2.user_organization.role_idor add the raising trigger — PR 2.UserInfo.permissions— PR 3.user_organization.role_idcolumn — PR 4.Review history baked in
Four rounds of automated review (Codex Security + Codex P-tier + Copilot) drove substantial design changes captured above. Headline outcomes:
UNIQUE (… , scope_id)with two partial unique indexes; soft-delete no longer blocks re-assignment.user_organization_roletorole(id, organization_id)makes a cross-org pointer structurally impossible.chk_role_custom_name_not_reservedis case-insensitive and whitespace-trim-tolerant; the partial unique index for customs is case-folded so"Admin"/"admin"collapse to one row per org.user_organization.role_iddoesn't dangle.All review threads from the four rounds are resolved on the PR.
Test plan
go build ./...cleango vet ./...cleanlefthook run pre-commitandlefthook run pre-push(golangci-lint + client-typecheck) both greenjust genclean — sqlc statically parses the migrations and produces compiling bindings; proto generation producesauthz/v1/authz_pb.tsandauthz/v1/authz.pb.gogo test ./internal/domain/authz/... -run 'TestAllPermissions|TestCatalog|TestLookup|TestResourceOrder'just devto bring up Postgres, thengo test ./internal/domain/authz/.... Covers: fresh-install seeding (TestReconcile_FreshInstall_*), idempotency, per-org isolation (TestReconcile_PerOrgIsolation_*), seed-on-create semantics (TestReconcile_OperatorRevokedSensitivePermStaysRevokedOnRestart,TestReconcile_NewCatalogPermissionDoesNotPropagateToExistingBuiltins), SUPER_ADMIN tamper repair, NULL-safe assignment uniqueness (TestAssignment_DuplicateLiveOrgScopeRejected,TestAssignment_SoftDeletedRowDoesNotBlockReassign), backfill behavior, soft-deleted-org seeding (TestSeed_SoftDeletedOrgStillGetsBuiltins), onboarding dual-write (TestOnboarding_FoundingUserGetsOrgScopeSuperAdminAssignment), reserved built-in names case-insensitive (TestCustomRole_ReservedBuiltinNamesRejectedCaseInsensitively), case-insensitive per-org custom uniqueness (TestCustomRole_CaseInsensitiveUniquenessPerOrg), and same-name-across-orgs-allowed (TestCustomRole_SameNameAcrossOrgsAllowed).just devapplies 000052–000054 cleanly;psqlconfirms per-org built-in role rows exist with the expected permission counts and all CHECK / FK constraints are in place.Rollback
Each migration has a paired
*.down.sql. The reconciler is invoked once at boot; reverting this PR removes that call. Down migrations were rewritten across the review rounds to be order-correct (custom cleanup before any DROPs, table-existence dependencies respected) and to preserve the legacy global ADMIN/SUPER_ADMIN rows that downstream rollback steps reference. 000054 hard-deletes the mirrored assignment rows it created instead of soft-deleting, so the role-table down-DELETE that follows succeeds against theON DELETE RESTRICTcomposite FK.