Context
The Locus deployment pipeline currently uses a long-lived Personal Access Token
(INFRA_REPO_PAT) stored in git-locus/.github to:
- Push a sync bundle to git-locus/docker2azure4student (branch sync/<run>-<sha>).
- Trigger Workflow B in docker2azure4student, which runs Terraform and deploys
  the new image onto the Azure VM.
The same PAT also has repo:write on api, client, and .github so it can
trigger and read repository_dispatch events.
Problem
A leaked PAT means:
- Full write access to all four repos (code injection, force-push, branch deletion).
- Ability to push arbitrary container images and trigger deploys to production.
- Long-lived credential with no automatic rotation; revocation is manual.
- Authentication is tied to an individual user account: this is a bus-factor
  risk, and if that user leaves the org the PAT silently keeps working until
  someone revokes it manually.
Proposed solution
Replace the PAT with two distinct mechanisms, each scoped to its purpose:
1. Cross-repo dispatch / push between repos in git-locus
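Use a GitHub App (locus-deploy-bot) installed on the four repos with the
minimum required permissions:
- contents:write (needed to push to docker2azure4student; the repository_dispatch
  endpoint also requires it on the target repo, e.g. .github)
- actions:write (only where workflows are triggered via workflow_dispatch)
- metadata:read
App permissions apply to every repo the App is installed on, so per-repo
scoping comes from the repositories input of the token step, which limits each
token to the repos a job actually needs.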
In each workflow that needs cross-repo access, mint a short-lived installation
token at job start, e.g.:
    - uses: actions/create-github-app-token@v1
      id: app-token
      with:
        app-id: ${{ vars.LOCUS_DEPLOY_APP_ID }}
        private-key: ${{ secrets.LOCUS_DEPLOY_APP_PRIVATE_KEY }}
        owner: git-locus
        repositories: docker2azure4student
Then use ${{ steps.app-token.outputs.token }} instead of INFRA_REPO_PAT.
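As a rough sketch of how the minted token might replace the PAT for the push, the fragment below checks out docker2azure4student with the installation token and pushes a sync branch. It assumes the app-token step shown above ran earlier in the same job; the step names, the local path infra, and the branch name built from github.run_id and github.sha are illustrative assumptions, not taken from the existing workflows.

```yaml
# Hypothetical follow-up steps; assumes a prior step with id "app-token".
steps:
  # Check out the infra repo with the installation token instead of the PAT.
  - uses: actions/checkout@v4
    with:
      repository: git-locus/docker2azure4student
      token: ${{ steps.app-token.outputs.token }}
      path: infra   # assumed local path

  # Copying the sync bundle into infra/ and committing it is omitted here.
  # checkout keeps the token in the local git config, so a plain push works.
  - name: Push sync branch
    working-directory: infra
    run: |
      git switch -c "sync/${{ github.run_id }}-${{ github.sha }}"
      git push origin HEAD
```

Because the token is scoped through the repositories input, a push to any other repo from this job would be rejected.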
Benefits:
- Token expires after 1 hour, is generated per job, and is never stored as a
  long-lived secret in the environment.
- Each token is limited to the repos and permissions a job needs (least privilege).
- Audit log shows the App as actor, not a human.
- Rotation = swap the App private key, no need to touch every workflow.
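For the dispatch half of this mechanism, the installation token can stand in for the PAT when calling the GitHub REST API. The fragment below is only a sketch: it assumes a preceding step with id app-token as shown earlier, that the target workflow listens for repository_dispatch, and a made-up event type deploy-image. None of these details are confirmed by the existing workflows.

```yaml
# Hypothetical step; the event type "deploy-image" is an illustrative assumption.
steps:
  - name: Dispatch to docker2azure4student
    env:
      # The gh CLI reads its credential from GH_TOKEN; no PAT is involved.
      GH_TOKEN: ${{ steps.app-token.outputs.token }}
    run: |
      # Passing a field with -f makes gh api send a POST request.
      gh api repos/git-locus/docker2azure4student/dispatches \
        -f event_type=deploy-image
```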
2. Azure authentication from Workflow B → Terraform
Replace the Azure service principal client secret currently consumed by the
azurerm provider with OIDC federated credentials:
- In Microsoft Entra ID, on the existing service principal:
  - Add a Federated Credential of type "GitHub Actions deploying Azure resources".
  - Subject: repo:git-locus/docker2azure4student:ref:refs/heads/main
    (one credential per branch/environment that should be allowed to deploy).
- In Workflow B:
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: azure/login@v2
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
- Remove AZURE_CLIENT_SECRET from the repo Secrets and from the SP.
Benefits:
- No long-lived Azure credential anywhere.
- Federated trust scoped to a specific repo and branch (or environment).
- Terraform can use the same federated identity: with ARM_USE_OIDC=true (plus
  ARM_CLIENT_ID, ARM_TENANT_ID, and ARM_SUBSCRIPTION_ID) the azurerm provider
  requests the GitHub OIDC token itself.
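The Terraform side of Workflow B could look roughly like the fragment below. It is a sketch under assumptions: the Terraform configuration is assumed to live in the repository root, the step names are invented, and hashicorp/setup-terraform is assumed to be an acceptable way to install Terraform. If the state backend is also azurerm, it may need the same OIDC settings.

```yaml
# Hypothetical Workflow B fragment; layout and step names are assumptions.
permissions:
  id-token: write   # lets the job request a GitHub OIDC token
  contents: read
steps:
  - uses: actions/checkout@v4
  - uses: hashicorp/setup-terraform@v3
  - name: Terraform apply
    env:
      # No ARM_CLIENT_SECRET: the provider exchanges the OIDC token instead.
      ARM_USE_OIDC: "true"
      ARM_CLIENT_ID: ${{ vars.AZURE_CLIENT_ID }}
      ARM_TENANT_ID: ${{ vars.AZURE_TENANT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ vars.AZURE_SUBSCRIPTION_ID }}
    run: |
      terraform init -input=false
      terraform apply -input=false -auto-approve
```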
Out of scope
- Migrating from Azure SP to Managed Identity on the VM (separate work).
- Rotating Postgres / GHCR credentials (separate hygiene pass).
Notes for the implementer
- Both changes can be rolled out incrementally: introduce the App / OIDC
alongside the PAT, switch one workflow at a time, then delete the PAT
once nothing references it (grep -r "INFRA_REPO_PAT\|DEPLOY_PAT" returns
empty across the four repos).
- The GitHub App private key should itself be stored only in the .github and
  api/client Secrets (never committed). Consider rotating it every
  6–12 months.
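The "nothing references it" check from the first note could also run as a guard step in each repo during the rollout. The fragment below is a hedged sketch: the step name is invented, and the bracket in the pattern is only a trick so that the guard's own workflow file does not match itself.

```yaml
# Hypothetical guard step, intended to run in each of the four repos.
steps:
  - uses: actions/checkout@v4
  - name: Fail if a PAT secret is still referenced
    run: |
      # "PA[T]" matches the literal secret names but not this line itself.
      if grep -rnE "INFRA_REPO_PA[T]|DEPLOY_PA[T]" --exclude-dir=.git .; then
        echo "PAT references remain" >&2
        exit 1
      fi
```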
Severity
High: the long-lived, broad-scope PAT is the single biggest credential exposure
left in the deployment pipeline after the round-1 hardening.
Acceptance criteria
- LOCUS_DEPLOY_APP_ID and LOCUS_DEPLOY_APP_PRIVATE_KEY configured in .github, api, client.
- Workflows replace secrets.DEPLOY_PAT / secrets.INFRA_REPO_PAT with installation
  tokens minted via actions/create-github-app-token.
- INFRA_REPO_PAT and DEPLOY_PAT deleted from all repo Secrets.
- Federated credential in place for git-locus/docker2azure4student, ref:refs/heads/main.
- Workflow B uses azure/login@v2 with id-token: write + OIDC; no client-secret argument.
- AZURE_CLIENT_SECRET removed from docker2azure4student Secrets and reset on the SP.
- Documentation updated in .github/README.md and knowledge/deploy/.