feat(AGX1-274): record task creator identity and FGAC migration safety #246
Draft
asherfink wants to merge 2 commits into
…n creation Adds two nullable creator-audit columns to the tasks table — creator_user_id and creator_service_account_id — populated from the principal context at create time. A CHECK constraint (ck_tasks_one_creator) enforces that at most one is set. This replaces the earlier dual-write draft: grants are already issued unconditionally via grant_with_retry in agents_acp_use_case.py:239, and per-account rollout routing belongs in agentex-auth (private), not in this public Apache-2.0 codebase.
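As a rough illustration of what the at-most-one-creator CHECK accepts and rejects, here is a minimal sketch. It assumes SQLite (standard library) as a stand-in for Postgres, a hypothetical three-column `tasks` table, and the predicate `(creator_user_id IS NULL) OR (creator_service_account_id IS NULL)`; the `try_insert` helper is illustrative only and is not part of this PR.

```python
import sqlite3

# Hypothetical, cut-down tasks table. SQLite stands in for Postgres only so the
# predicate can be exercised without a database server.
DDL = """
CREATE TABLE tasks (
    id TEXT PRIMARY KEY,
    creator_user_id TEXT,
    creator_service_account_id TEXT,
    CONSTRAINT ck_tasks_one_creator
        CHECK ((creator_user_id IS NULL) OR (creator_service_account_id IS NULL))
)
"""


def try_insert(conn, task_id, user_id, service_account_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tasks VALUES (?, ?, ?)",
                (task_id, user_id, service_account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(DDL)

results = {
    "user_only": try_insert(conn, "t1", "user-1", None),
    "service_account_only": try_insert(conn, "t2", None, "sa-1"),
    "both_null": try_insert(conn, "t3", None, None),
    "both_set": try_insert(conn, "t4", "user-1", "sa-1"),
}
```

Only the `both_set` row is rejected; a row with neither creator is still valid, which is why the constraint is "at most one" rather than "exactly one".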
Related work
Parent epic: AGX1-264 — per-task FGAC. Follow-ups bundled in AGX1-291.
This change is part of a 5-PR stack across 3 repos. Merge order: scaleapi/scaleapi#144783 (release sgp-authz 0.7.1) → scaleapi/agentex#353 → scaleapi/agentex#356 → this PR → #249.
`Action.CANCEL`, `cancel` op, `register_resource` API + cancel cleanup
Two commits: keep them separate during review, since the audit-column schema change is independent of the dual-write call sites.
Summary
Commit 1 (passive audit columns):
- Adds `creator_user_id` / `creator_service_account_id` columns to the `tasks` table, populated from the request principal on `AgentTaskService.create_task`. Best-effort (NULLable; see caveat below).
- `CHECK ((creator_user_id IS NULL) OR (creator_service_account_id IS NULL))` to enforce at-most-one creator type at the DB layer (constraint name: `ck_tasks_at_most_one_creator`).
- `ix_tasks_creator_user_id` and `ix_tasks_creator_service_account_id` (`CREATE INDEX CONCURRENTLY`) for future "tasks created by X" lookups.

Commit 2 (FGAC dual-write call sites + flag):
- `FGAC_TASKS_DUAL_WRITE` env-var flag, injected into `AgentTaskService` via FastAPI DI. Gates the dual-write behavior end-to-end.
- `create_task` calls `register_resource(task, parent_resource=agent)` on the authorization service after the Postgres row is persisted, so the task is registered with `tenant + owner + parent_agent` tuples atomically (via scaleapi/agentex#356's new endpoint).
- `delete_task` calls `deregister_resource(task)` after the Postgres delete. It pre-resolves the task id by name first so the post-delete deregister doesn't race the lookup.
- `_dual_write_with_retry(op_name, do_call, task_id)` helper. Retries `AuthenticationServiceUnavailableError` / `AuthenticationGatewayError` with exponential backoff + jitter (3 retries → 4 total attempts max), mirroring `AgentsACPUseCase.grant_with_retry`. Non-transient exceptions are not retried.
- Metrics (`task_fgac_dual_write.attempt|success|retry|failure`) tagged with `op:register|deregister` and `exception_class:<name>` on failure. These are the rollout signal for AGX1-291's operator runbook.

Migration safety
- `ALTER TABLE ... ADD CONSTRAINT ... NOT VALID` + `ALTER TABLE ... VALIDATE CONSTRAINT`: splits the operation so the `ACCESS EXCLUSIVE` lock is held only briefly and never for the full-table validation scan. `tasks` is high-write; a CHECK addition without `NOT VALID` would hold that lock for the whole scan, blocking readers and writers until it finished.
- Indexes are created `CONCURRENTLY` in an `autocommit_block`.
- Revision `a1f73ada66c5` (`add_task_creator_columns`). `down_revision` is `6c942325c828` (`adding_task_cleaned_at`, the current alembic head on main); `migration_history.txt` regenerated via `alembic history`.
- The ORM-side `CheckConstraint` in `orm.py` matches the DB-side constraint (same constraint name + predicate).

Rollout
`register_resource` and `deregister_resource` fire on create/delete. If they fail after retries, the Postgres row is still the durable record; orphan auth tuples can be cleaned up out of band per the AGX1-291 operator runbook, using the creator-audit columns to identify them.

Audit-trail caveat
Creator attribution is best-effort: tasks created outside an HTTP request context (Temporal activities, background workers, any path that constructs `AgentTaskService` without `request.state.principal_context`) leave both columns NULL. The CHECK constraint allows both-NULL, and `test_no_resolvable_creator_leaves_both_columns_null` exercises this path.

What changed
- `database/migrations/alembic/versions/2026_05_21_1508_add_task_creator_columns_a1f73ada66c5.py` (new): NOT VALID-pattern migration. `down_revision = "6c942325c828"`.
- `src/adapters/orm.py`: declarative `CheckConstraint` mirroring the DB constraint.
- `src/domain/entities/tasks.py`: new optional fields on `TaskEntity`.
- `src/domain/services/task_service.py`:
  - `_principal_field` helper (handles dict-vs-pydantic principal shape from the authn proxy).
  - `create_task` reads `creator_user_id` / `creator_service_account_id` from principal context.
  - `AgentTaskService.__init__` takes `dual_write_enabled: DEnvironmentVariable(EnvVarKeys.FGAC_TASKS_DUAL_WRITE)`.
  - `_dual_write_with_retry(op_name, do_call, task_id)` keyed by op name; reused from both call sites.
- `src/adapters/authorization/adapter_agentex_authz_proxy.py`: forwards to agentex-auth's `/v1/authz/register` and `/deregister`.
- `src/config/environment_variables.py`: new `FGAC_TASKS_DUAL_WRITE` key.
- `test_task_audit_columns.py`: testcontainers Postgres integration tests for the audit columns (creator population, mutual-exclusion CHECK, both-NULL allowed).
- `test_task_fgac_dual_write.py`: covers register-on-create, deregister-on-delete, flag-off skip, transient retry-and-succeed (both register and deregister sides), retry exhaustion propagating with the Postgres row preserved, and the name-route `ItemDoesNotExist` swallow.
- `dual_write_enabled` constructor parameter.

Test plan
- `migration_lint.py`: clean.
- `test_task_audit_columns.py`: 7/7 pass locally via testcontainers.
- `test_task_fgac_dual_write.py`: collects cleanly; runs in CI integration suite.
- `\d tasks` shows new columns + constraint + indexes; flip flag on for one account, confirm `task_fgac_dual_write.success` fires.
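The retry policy described for `_dual_write_with_retry` in the Summary (3 retries, 4 total attempts, exponential backoff plus jitter, transient errors only) could look roughly like the sketch below. It is an assumption-laden illustration, not the code in this PR: the function is synchronous, the two exception classes are local stand-ins, and `max_retries`, `base_delay`, `sleep`, and `emit` are hypothetical parameters added so the behavior can be exercised in isolation.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class AuthenticationServiceUnavailableError(Exception):
    """Local stand-in for the transient error type named in the summary."""


class AuthenticationGatewayError(Exception):
    """Local stand-in for the transient error type named in the summary."""


TRANSIENT_ERRORS = (AuthenticationServiceUnavailableError, AuthenticationGatewayError)


def dual_write_with_retry(
    op_name: str,
    do_call: Callable[[], T],
    task_id: str,
    max_retries: int = 3,  # 3 retries -> 4 total attempts
    base_delay: float = 0.1,  # assumed value, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
    emit: Callable[[str, str], None] = lambda event, task_id: None,
) -> T:
    """Run do_call, retrying only transient auth errors with backoff + jitter."""
    for attempt in range(max_retries + 1):
        emit(f"{op_name}.attempt", task_id)
        try:
            result = do_call()
        except TRANSIENT_ERRORS:
            if attempt == max_retries:
                # Retries exhausted: report failure and propagate the error.
                emit(f"{op_name}.failure", task_id)
                raise
            emit(f"{op_name}.retry", task_id)
            # Delay doubles per attempt, plus random jitter up to base_delay.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
        except Exception:
            # Non-transient errors are never retried.
            emit(f"{op_name}.failure", task_id)
            raise
        else:
            emit(f"{op_name}.success", task_id)
            return result
    raise AssertionError("unreachable: loop always returns or raises")
```

Injecting `sleep` and `emit` keeps the sketch testable without real delays or a metrics client; a caller would pass something like `lambda: authz.register_resource(task)` as `do_call`, where `authz` is whatever authorization client is in scope.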