[Security] Replace cross-repo PAT with OIDC + GitHub App for Azure deployment #1

@morph-eos

Context

The Locus deployment pipeline currently uses a long-lived Personal Access Token
(INFRA_REPO_PAT) stored in git-locus/.github to:

  1. Push a sync bundle to git-locus/docker2azure4student (branch sync/<run>-<sha>).
  2. Trigger Workflow B in docker2azure4student which runs Terraform and deploys
    the new image onto the Azure VM.

The same PAT also has write access to api, client, and .github so it can
trigger and read repository_dispatch events.

Problem

A leaked PAT means:

  • Full write access to all four repos (code injection, force-push, branch deletion).
  • Ability to push arbitrary container images and trigger deploys to production.
  • Long-lived credential with no automatic rotation; revocation is manual.
  • Authentication is tied to an individual user account (bus factor: the PAT
    keeps that user's full access for as long as the account stays in the org,
    and deploys break as soon as the account is removed).

Proposed solution

Replace the PAT with two distinct mechanisms, each scoped to its purpose:

1. Cross-repo dispatch / push between repos in git-locus

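Use a GitHub App (locus-deploy-bot) installed on the four repos with the
minimum required permissions. App permissions apply to every repo the App is
installed on, so per-repo scoping happens when the token is minted (the
repositories input below):

  • contents:write (to push the sync branch to docker2azure4student; the REST
    endpoint that creates repository_dispatch events also requires it on the
    target repo)
  • actions:write (only if workflows are triggered via workflow_dispatch or
    workflow runs need to be re-run or cancelled)
  • metadata:read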

In each workflow that needs cross-repo access, mint a short-lived installation
token at job start, e.g.:

- uses: actions/create-github-app-token@v1
  id: app-token
  with:
    app-id: ${{ vars.LOCUS_DEPLOY_APP_ID }}
    private-key: ${{ secrets.LOCUS_DEPLOY_APP_PRIVATE_KEY }}
    owner: git-locus
    repositories: docker2azure4student

Then use ${{ steps.app-token.outputs.token }} instead of INFRA_REPO_PAT.
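
As a rough sketch of how the minted token could stand in for the PAT in the sending workflow, the job below pushes the sync branch and then fires the dispatch. The job name, the commit identity, the `deploy` event type, the `client_payload` field, and the use of `github.run_id` for `<run>` are all placeholders rather than values taken from the existing workflows, and it assumes the App has been granted Contents write.

```yaml
jobs:
  sync-and-dispatch:                  # hypothetical job name
    runs-on: ubuntu-latest
    env:
      # Assumed mapping of sync/<run>-<sha>; adjust to the real convention.
      SYNC_BRANCH: sync/${{ github.run_id }}-${{ github.sha }}
    steps:
      - uses: actions/create-github-app-token@v1
        id: app-token
        with:
          app-id: ${{ vars.LOCUS_DEPLOY_APP_ID }}
          private-key: ${{ secrets.LOCUS_DEPLOY_APP_PRIVATE_KEY }}
          owner: git-locus
          repositories: docker2azure4student

      # Check out the infra repo with the installation token. checkout keeps
      # the token in the local git config, so the later push authenticates with it.
      - uses: actions/checkout@v4
        with:
          repository: git-locus/docker2azure4student
          token: ${{ steps.app-token.outputs.token }}

      - name: Push sync branch
        run: |
          git config user.name "locus-deploy-bot"                # placeholder identity
          git config user.email "locus-deploy-bot@example.invalid"
          git switch -c "$SYNC_BRANCH"
          # ... copy the sync bundle into the working tree here ...
          git add -A
          git commit --allow-empty -m "sync: $GITHUB_SHA"
          git push origin "$SYNC_BRANCH"

      - name: Trigger Workflow B
        env:
          GH_TOKEN: ${{ steps.app-token.outputs.token }}        # read by the gh CLI
        run: |
          gh api --method POST repos/git-locus/docker2azure4student/dispatches \
            -f event_type=deploy \
            -f "client_payload[branch]=$SYNC_BRANCH"
```

Workflow B would then need a matching `on: repository_dispatch` trigger with `types: [deploy]` (again, an assumed event name).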

Benefits:

  • Token expires after 1 hour (and the action revokes it when the job ends),
    is minted per job, and is never stored as a long-lived secret.
  • Permissions per-repo and per-resource (least privilege).
  • Audit log shows the App as actor, not a human.
  • Rotation = swap the App private key, no need to touch every workflow.

2. Azure authentication from Workflow B → Terraform

Replace the Azure service principal client secret currently consumed by
the azurerm provider with OIDC federated credentials:

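  1. In Microsoft Entra ID, on the app registration behind the existing
     service principal:
    • Add a Federated Credential of type "GitHub Actions deploying Azure resources"
    • Subject: repo:git-locus/docker2azure4student:ref:refs/heads/main
      (one credential per branch/environment that should be allowed to deploy;
      the subject must match the run's ref exactly, which holds for
      repository_dispatch runs because they execute on the default branch,
      assumed here to be main).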
  2. In Workflow B:

    permissions:
      id-token: write
      contents: read

    steps:
      - uses: azure/login@v2
        with:
          client-id:       ${{ vars.AZURE_CLIENT_ID }}
          tenant-id:       ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

  3. Remove AZURE_CLIENT_SECRET from the repo Secrets and from the SP.

Benefits:

  • No long-lived Azure credential anywhere.
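  • Federated trust scoped to a specific repo + branch (or environment, tag,
    or pull request, depending on the subject).
  • Terraform can use the same federated identity: the azurerm provider
    requests the OIDC token itself when ARM_USE_OIDC=true is set alongside
    ARM_CLIENT_ID, ARM_TENANT_ID, and ARM_SUBSCRIPTION_ID.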
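
A minimal sketch of the Terraform side of Workflow B, assuming the azurerm provider is configured purely through `ARM_*` environment variables. The job name and the bare `terraform init` / `terraform apply` invocations are illustrative; the real workflow's working directory, backend settings, and plan/apply split are not known from this issue.

```yaml
jobs:
  terraform:                          # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      id-token: write                 # lets the job request a GitHub OIDC token
      contents: read
    env:
      # azurerm exchanges the GitHub OIDC token for an Azure token when this is set.
      ARM_USE_OIDC: "true"
      ARM_CLIENT_ID: ${{ vars.AZURE_CLIENT_ID }}
      ARM_TENANT_ID: ${{ vars.AZURE_TENANT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ vars.AZURE_SUBSCRIPTION_ID }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -input=false
      - run: terraform apply -input=false -auto-approve
```

No `ARM_CLIENT_SECRET` appears anywhere in the job, which is the point of the change; the `azure/login` step is still needed only if later steps call the `az` CLI directly.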

Acceptance criteria

  • LOCUS_DEPLOY_APP_ID and LOCUS_DEPLOY_APP_PRIVATE_KEY configured in
    .github, api, client.
  • All workflows replace secrets.DEPLOY_PAT / secrets.INFRA_REPO_PAT
    with installation tokens minted via actions/create-github-app-token.
  • INFRA_REPO_PAT and DEPLOY_PAT deleted from all repo Secrets.
  • Federated Credential created on the Azure SP with subject
    repo:git-locus/docker2azure4student:ref:refs/heads/main.
  • Workflow B uses azure/login@v2 with id-token: write + OIDC; no
    client-secret argument.
  • AZURE_CLIENT_SECRET removed from docker2azure4student Secrets and the
    client secret deleted from the SP.
  • Documented in .github/README.md and knowledge/deploy/.
  • One end-to-end deploy run executed successfully after the change.

Out of scope

  • Migrating from Azure SP to Managed Identity on the VM (separate work).
  • Rotating Postgres / GHCR credentials (separate hygiene pass).

Notes for the implementer

  • Both changes can be rolled out incrementally: introduce the App / OIDC
    alongside the PAT, switch one workflow at a time, then delete the PAT
    once nothing references it (grep -r "INFRA_REPO_PAT\|DEPLOY_PAT" returns
    empty across the four repos).
  • The GitHub App private key should itself be stored only as an Actions
    secret in .github, api, and client (never committed). Consider rotating
    it every 6–12 months.
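
The "nothing references it" check from the first note could also run in CI during the migration, so a stray PAT reference fails the build instead of relying on a manual grep. The workflow below is only a sketch: its name and triggers are hypothetical, and it scans just `.github/workflows` in the repo it runs in, so each of the four repos would need its own copy.

```yaml
name: no-pat-references               # hypothetical workflow name
on: [push, pull_request]

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Fail if a PAT secret is still referenced
        run: |
          # The [T] bracket keeps this file from matching its own pattern
          # while still matching the two real secret names.
          if grep -rnE 'INFRA_REPO_PA[T]|DEPLOY_PA[T]' .github/workflows; then
            echo "Found a remaining PAT reference; migrate it to the App token." >&2
            exit 1
          fi
```

Enable it only once the last workflow has been switched over, otherwise it will fail on the references that are still intentionally in place.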

Severity

High — long-lived broad-scope PAT is the single biggest credential exposure
left in the deployment pipeline after the round-1 hardening.
