diff --git a/docs/solutions/best-practices/discord-slash-command-orchestration-patterns-2026-05-27.md b/docs/solutions/best-practices/discord-slash-command-orchestration-patterns-2026-05-27.md new file mode 100644 index 00000000..6d3c56cf --- /dev/null +++ b/docs/solutions/best-practices/discord-slash-command-orchestration-patterns-2026-05-27.md @@ -0,0 +1,215 @@ +--- +title: Discord slash command orchestration — permission checks, bootstrap wiring, and token safety +date: 2026-05-27 +category: best-practices +module: gateway +problem_type: best_practice +component: assistant +severity: medium +applies_when: + - A discord.js guild interaction checks bot permissions without the GuildMembers privileged intent + - A command uses dependency injection via a factory but real dispatch/bootstrap wiring is not integration-tested + - A short-lived credential (installation access token) crosses a process or HTTP boundary + - A multi-phase command can partially fail and needs documented recovery + - A find-or-create loop runs against a cache that concurrent operations mutate +tags: + - discordjs + - app-permissions + - dependency-injection + - bootstrap-wiring + - integration-tests + - installation-access-token + - multi-phase-orchestration + - cache-race + - gateway +--- + +# Discord slash command orchestration — permission checks, bootstrap wiring, and token safety + +## Context + +Building the first user-facing gateway feature — the `/fro-bot add-project` Discord slash command — surfaced several non-obvious traps. The command orchestrates four subsystems: GitHub App authentication, a workspace clone over HTTP, Discord channel creation, and an S3 binding write. Each integration boundary hid a failure mode that the initial implementation and its unit tests missed, and that only surfaced under architecture review (Oracle) and automated review (Fro Bot). + +These are reusable patterns for any agent-driven Discord command, any factory-injected feature, and any service-to-service flow carrying a credential. + +## Guidance + +### 1. Discord permission checks: use `interaction.appPermissions`, not `guild.members.cache.get()` + +This is the highest-value learning. A permission check that resolves the bot's own `GuildMember` from the cache returns `undefined` whenever the bot runs **without** the privileged `GuildMembers` intent — which is the default, security-conscious posture. The result is a false-negative: the check fails even when the bot genuinely holds the permission, and the command aborts on a correctly-configured bot. + +**Broken pattern:** + +```ts +function botHasRequiredPermissions(guild: Guild, botUserId: string): boolean { + const member = guild.members.cache.get(botUserId) // undefined without GuildMembers intent + if (member === undefined) return false // false-negative abort + return member.permissions.has(PermissionFlagsBits.ManageChannels) +} +``` + +**Correct pattern** — `interaction.appPermissions` is a `PermissionsBitField | null` that Discord populates reliably for every guild interaction, independent of gateway intents: + +```ts +function botHasRequiredPermissions(appPermissions: PermissionsBitField | null): boolean { + if (appPermissions === null) return false // DM interaction — command is guild-only + return ( + appPermissions.has(PermissionFlagsBits.ManageChannels) && + appPermissions.has(PermissionFlagsBits.SendMessages) + ) +} + +// Call site: +if (botHasRequiredPermissions(interaction.appPermissions) === false) { + await interaction.editReply({ + content: `fro-bot needs **Manage Channels** and **Send Messages**. Re-invite at: ${installUrl}`, + }) + return +} +``` + +### 2. Test the real dispatch path, not just the handler + +The command was refactored to receive its dependencies through a factory (`createFroBotCommand(deps)`), but the bootstrap in `program.ts` never constructed and passed those deps. Every test still passed — because the tests called the orchestration function (`executeAddProject`) **directly**, bypassing the Discord dispatch path entirely. The wiring gap would have shipped broken-in-production code with a green suite. An architecture reviewer caught it, not the tests. + +The fix is an integration-level test that builds the **real** registry and dispatches a **real** interaction: + +```ts +export function createFroBotCommand(deps: AddProjectDeps): SlashCommand { + const execute = (interaction: ChatInputCommandInteraction): Effect.Effect => { + const subcommand = interaction.options.getSubcommand(true) + if (subcommand === 'add-project') return executeAddProject(interaction, deps) + return Effect.fail(new Error(`Unknown subcommand: ${subcommand}`)) + } + return {data, execute} +} + +// Test the dispatch seam, not just executeAddProject: +const registry = getCommandRegistry(makeMockDeps()) +await Effect.runPromise(dispatchCommand(mockInteraction, registry)) +expect(mockDeps.workspaceClient.clone).toHaveBeenCalled() +``` + +`getCommandRegistry` and `dispatchCommand` live in `packages/gateway/src/discord/commands/index.ts`; the factory and bootstrap wiring are in `commands/fro-bot.ts` and `program.ts` respectively. + +### 3. Installation access token (IAT) handling across an HTTP boundary + +The IAT (`ghs_*`, ~1hr lifetime) flows from GitHub App auth into an HTTP request body to the workspace service. Three rules emerged: + +- **Never log request or response body**, even on retry or error. The body carries the token. Mark it as a module invariant. +- **Captured-logger test** asserting no `ghs_` string appears in any log line, iterating **all** error paths (timeout, network, http, parse, response-mismatch, clone-error, malformed-success) — not just the happy path. +- **Response-path integrity check** — the response must correspond to the requested `owner/repo`, defending against a misbehaving or compromised service. +- **`AbortSignal.timeout`** on the fetch (native fetch has no default timeout). + +```ts +// SECURITY INVARIANT: never log request/response body — it carries the IAT. +const response = await fetch(`${baseUrl}/clone`, { + method: 'POST', + headers: {'Content-Type': 'application/json'}, + body, // contains token + signal: AbortSignal.timeout(timeoutMs), +}) + +// Compare the full path, not just the suffix. owner/repo are already +// lowercased upstream, so the expected path is canonical; do not lowercase +// the response path or a case-variant root could slip through. +const expectedPath = `${EXPECTED_WORKSPACE_ROOT}/${owner}/${repo}` +if (parsed.path !== expectedPath) { + return err({kind: 'response-mismatch'}) +} +``` + +A suffix-only check is the weaker form to avoid — `parsed.path.endsWith('/owner/repo')` accepts adversarial prefixes like `/etc/passwd/owner/repo`, letting a misbehaving agent bind a repo to an arbitrary filesystem location. Validate against the full expected path rooted at the known workspace root. + +```ts +// Captured-logger security test — iterate ALL error paths: +for (const line of spy.lines) { + expect(line).not.toContain('ghs_') +} +``` + +### 4. Multi-phase orchestration with idempotent self-healing recovery + +The flow is a five-phase state machine: `PRE_FLIGHT → CLONING → CREATING_CHANNEL → WRITING_BINDING → READY`. The primary recovery path for the clone-exists/no-binding partial-failure shape is **re-running the command** — `/add-project` is idempotent for this case: + +- **Atomic clone guarantee**: the workspace-agent renames the temp dir to the final dest atomically. `repo-exists` from a clone call means the clone is fully complete — not partial. +- **Resume keyed on confirmed absent binding**: when `repo-exists` is returned, the handler re-reads the bindings store. It only resumes (falls through to `CREATING_CHANNEL`) when the store confirms `data === null`. A store error (thrown or `success === false`) returns an internal-error reply and does **not** resume — resuming on an unreadable store would risk orphan channels if the binding write also fails. +- **A defensive permission re-check at the side-effect boundary** (`CREATING_CHANNEL`) — permissions can change between `PRE_FLIGHT` and the first mutation. +- **Preserve expensive prior work** — a failure after `CLONING` keeps the clone on disk; retrying the command will detect `repo-exists`, confirm no binding, and resume from `CREATING_CHANNEL`. + +Accepted v1 fallout and boundaries: + +- **Orphan channel on concurrent resume**: if two invocations both resume simultaneously, both create a channel, then both race to write the binding. The loser's `createBinding` is rejected with `BINDING_EXISTS_ERROR` (atomic IfNoneMatch write). The loser gets a "bound by a concurrent request / manual cleanup may be needed" reply. The orphan channel is a bounded, operator-visible artifact — not a silent data integrity issue. +- **Shutdown gating**: new invocations are refused during drain (after `deferReply` so Discord gets its mandatory ack). In-flight runs that are hard-cut during a process restart are healed by the next retry via the resume path. + +```ts +// Resume only on CONFIRMED absent binding — store errors do NOT resume: +let existing: Awaited> +try { + existing = await bindingsStore.getBindingByRepo(owner, repo) +} catch { + await interaction.editReply({content: 'Internal error checking existing bindings. Please retry in a moment.'}) + return +} +if (existing.success === false) { + await interaction.editReply({content: 'Internal error checking existing bindings. Please retry in a moment.'}) + return +} +if (existing.data !== null) { + // Already bound — redirect. + await interaction.editReply({content: `\`${owner}/${repo}\` is already set up in <#${existing.data.channelId}>.`}) + return +} +// Confirmed absent binding — resume from CREATING_CHANNEL. +workspacePath = workspaceRepoPath(owner, repo) +// fall through — do NOT return + +// Partial-write recovery names both keys for manual cleanup: +if (error.code === 'BINDING_PARTIAL_WRITE_ERROR') { + await interaction.editReply({ + content: `Partial write: primary written, index failed. Manual S3 cleanup:\n- Primary: \`${error.primaryKey}\`\n- Index: \`${error.indexKey}\``, + }) +} +``` + +### 5. Re-read live caches inside find-or-create collision loops + +Snapshotting `guild.channels.cache` once and iterating suffix candidates (`name`, `name-2`, …) against the frozen snapshot is a race. discord.js runs in a single Node process, so this is not OS-level concurrency — it's inter-await interleaving: between two `await` calls in the same event loop, the cache can be mutated by an incoming `channelCreate` gateway event or by another in-flight interaction. Two in-flight interactions interleaved across awaits can see the same snapshot, both attempt to create the same name, and the loser's Discord 50035 (duplicate name) rejection gets silently swallowed as a "transient error" — the binding ends up pointing to the wrong suffixed channel with no operator-visible warning. + +```ts +for (const candidate of candidates) { + // Re-read the LIVE cache each iteration — a concurrent setup may have + // created this candidate since the loop began. + const existing = guild.channels.cache.find( + c => c.name.toLowerCase() === candidate.toLowerCase() && c.type === ChannelType.GuildText, + ) + if (existing !== undefined) continue + const result = await tryCreate(guild, candidate) + // ... +} +``` + +Pair this with a `tryCreate` that distinguishes **name-taken** (advance to next suffix) from **permission-denied** (short-circuit) from **create-failed / rate-limited** (surface a distinct error instead of burning through all suffixes). + +## Why This Matters + +| Trap | Cost if unknown | +| --- | --- | +| Wrong permission source | False-negative abort makes the command unusable on a correctly-configured bot | +| Handler-only tests | Green CI ships broken-in-production bootstrap wiring | +| IAT leakage | Security incident — a short-lived credential on disk/in logs | +| No partial-failure recovery | Operators stranded mid-flow with orphaned channels and clones | +| Cache-snapshot race | Silently binds to the wrong channel under concurrent use | + +## When to Apply + +- Any discord.js slash command that checks bot permissions +- Any feature whose dependencies are injected via a factory and wired at bootstrap +- Any service-to-service flow carrying a short-lived credential +- Any multi-step orchestration with persistent side effects +- Any find-or-create loop against a cache that concurrent operations mutate + +## Related + +- `docs/solutions/code-quality/architectural-issues-type-safety-and-resource-cleanup.md` — adjacent: deterministic recovery and `finally`-style cleanup orchestration (different subsystem, same discipline of making partial-failure paths explicit) +- GitHub issue #646 — Gateway intent-posture flip (privileged intents opt-in), the posture that makes the `appPermissions` trap (learning #1) reachable diff --git a/packages/gateway/AGENTS.md b/packages/gateway/AGENTS.md index 152be6f0..dd4a6b69 100644 --- a/packages/gateway/AGENTS.md +++ b/packages/gateway/AGENTS.md @@ -74,6 +74,17 @@ Existing deployments that need the privileged set must set this on the next deploy. The allowlist is intentionally narrow — operators cannot enable arbitrary Discord intents via this knob. +## Known limitations (v1) + +- **`add-project` is Discord-only.** The orchestration runs inside the slash + command handler and requires a `ChatInputCommandInteraction`; there is no + programmatic surface (HTTP endpoint, CLI, or agent tool) that triggers the same + outcome. An autonomous agent cannot bind a repo without going through Discord. + Recovery is via idempotent retry — re-running the command resumes a partial + setup — rather than agent-callable recovery primitives. Extracting a + Discord-independent `addProject(request, deps)` primitive is deferred until a + non-Discord caller exists. + ## Build ```bash diff --git a/packages/gateway/src/discord/commands/add-project.test.ts b/packages/gateway/src/discord/commands/add-project.test.ts index 34d82f8f..88cff80d 100644 --- a/packages/gateway/src/discord/commands/add-project.test.ts +++ b/packages/gateway/src/discord/commands/add-project.test.ts @@ -15,6 +15,7 @@ import {err, ok} from '@fro-bot/runtime' import {Effect} from 'effect' import {afterEach, beforeEach, describe, expect, it, vi} from 'vitest' import {AppNotInstalledError} from '../../github/app-client.js' +import {__resetShuttingDownForTests} from '../../shutdown.js' import {executeAddProject} from './add-project.js' // --------------------------------------------------------------------------- @@ -41,6 +42,11 @@ function lastReplyContent(reply: ReturnType): string { return calls.at(-1)?.[0]?.content ?? '' } +/** Collect all editReply content strings across every call. */ +function allEditReplies(editReply: ReturnType): string[] { + return (editReply.mock.calls as [{content?: string}][]).map(c => c[0]?.content ?? '') +} + interface MockChannel { readonly channel: TextChannel readonly send: ReturnType @@ -62,7 +68,9 @@ function makeGuild( has: vi.fn().mockReturnValue(hasPermissions), } const member = {permissions} - const members = {cache: {get: vi.fn().mockReturnValue(member)}} + const members = { + fetch: vi.fn().mockResolvedValue(member), + } const channelCache = { find: (pred: (ch: TextChannel) => boolean) => channels.find(pred), @@ -400,7 +408,6 @@ describe('executeAddProject', () => { {code: 'clone-failed', expectedFragment: 'clone-failed'}, {code: 'disk-full', expectedFragment: 'out of space'}, {code: 'enospc', expectedFragment: 'out of space'}, - {code: 'repo-exists', expectedFragment: 'already exists'}, {code: 'head-resolution-failed', expectedFragment: 'head-resolution-failed'}, {code: 'clone-timeout', expectedFragment: 'clone-timeout'}, ] @@ -461,7 +468,10 @@ describe('executeAddProject', () => { filter: () => ({find: () => undefined, some: () => false}), } const create = vi.fn().mockResolvedValue(makeTextChannel().channel) - const guild = {channels: {cache: channelCache, create}} as unknown as Guild + const guild = { + members: {fetch: vi.fn().mockResolvedValue({permissions: {has: vi.fn().mockReturnValue(true)}})}, + channels: {cache: channelCache, create}, + } as unknown as Guild const {interaction, editReply} = makeInteraction({guild, userId, appPermissions}) const deps = makeDeps() @@ -488,6 +498,7 @@ describe('executeAddProject', () => { const nameTakenError = Object.assign(new Error('Invalid Form Body'), {code: 50035}) const create = vi.fn().mockRejectedValue(nameTakenError) const guild = { + members: {fetch: vi.fn().mockResolvedValue({permissions: {has: vi.fn().mockReturnValue(true)}})}, channels: {cache: channelCache, create}, } as unknown as Guild const {interaction, editReply} = makeInteraction({guild, userId}) @@ -602,6 +613,427 @@ describe('executeAddProject', () => { }) }) + // ------------------------------------------------------------------------- + // Invoking-user authorization gate + // ------------------------------------------------------------------------- + + describe('PRE_FLIGHT: user authorization', () => { + it('rejects user without guild-level ManageChannels and does not proceed to clone', async () => { + // #given — members.fetch resolves to a member whose guild-level permissions lack ManageChannels + const userId = uniqueUserId() + const clone = vi.fn() + const unauthorizedMember = {permissions: {has: vi.fn().mockReturnValue(false)}} + const guild = makeGuild('bot-user-id', true, []) + ;(guild.members as unknown as {fetch: ReturnType}).fetch = vi + .fn() + .mockResolvedValue(unauthorizedMember) + const {interaction, editReply} = makeInteraction({userId, guild}) + const deps = makeDeps({workspaceClient: makeWorkspaceClient({clone})}) + + // #when + await run(interaction, deps) + + // #then — rejected with permission message; clone never invoked + expect(lastEditReplyContent(editReply)).toContain('Manage Channels') + expect(clone).not.toHaveBeenCalled() + }) + + it('denies a user whose ManageChannels comes only from a channel overwrite, not guild-level', async () => { + // #given — guild-level permissions.has returns false (no base guild ManageChannels), + // even though a channel-scoped overwrite might grant it in a real Discord server. + const userId = uniqueUserId() + const clone = vi.fn() + const channelOverwriteOnlyMember = {permissions: {has: vi.fn().mockReturnValue(false)}} + const guild = makeGuild('bot-user-id', true, []) + ;(guild.members as unknown as {fetch: ReturnType}).fetch = vi + .fn() + .mockResolvedValue(channelOverwriteOnlyMember) + const {interaction, editReply} = makeInteraction({userId, guild}) + const deps = makeDeps({workspaceClient: makeWorkspaceClient({clone})}) + + // #when + await run(interaction, deps) + + // #then — rejected; channel-scoped overwrite does NOT bypass the guild-level gate + expect(lastEditReplyContent(editReply)).toContain('Manage Channels') + expect(clone).not.toHaveBeenCalled() + }) + + it('allows user WITH guild-level ManageChannels to proceed past the authorization gate', async () => { + // #given — members.fetch resolves to a member with guild-level ManageChannels + const userId = uniqueUserId() + const {PermissionFlagsBits} = await import('discord.js') + const authorizedMember = { + permissions: {has: vi.fn().mockImplementation((flag: bigint) => flag === PermissionFlagsBits.ManageChannels)}, + } + const {channel} = makeTextChannel('testrepo') + const guild = makeGuild('bot-user-id', true, [], channel) + ;(guild.members as unknown as {fetch: ReturnType}).fetch = vi + .fn() + .mockResolvedValue(authorizedMember) + const {interaction, editReply} = makeInteraction({userId, guild}) + const deps = makeDeps() + + // #when + await run(interaction, deps) + + // #then — proceeds through all phases and reaches READY + expect(lastEditReplyContent(editReply)).toContain('Ready') + }) + + it('fail-closed: members.fetch rejection denies access and does not throw', async () => { + // #given — members.fetch rejects (e.g. network error, member not in guild) + const userId = uniqueUserId() + const clone = vi.fn() + const guild = makeGuild('bot-user-id', true, []) + ;(guild.members as unknown as {fetch: ReturnType}).fetch = vi + .fn() + .mockRejectedValue(new Error('Unknown Member')) + const {interaction, editReply} = makeInteraction({userId, guild}) + const deps = makeDeps({workspaceClient: makeWorkspaceClient({clone})}) + + // #when — must resolve without throwing + await run(interaction, deps) + + // #then — fail-closed: denied, clone never invoked + expect(lastEditReplyContent(editReply)).toContain('Manage Channels') + expect(clone).not.toHaveBeenCalled() + }) + }) + + // ------------------------------------------------------------------------- + // repo-exists never emits deletion instructions + // ------------------------------------------------------------------------- + + describe('CLONING: repo-exists safe messaging', () => { + it('repo-exists + binding found → redirects to bound channel, no rm -rf', async () => { + // #given — clone fails with repo-exists; bindings store returns existing binding + const userId = uniqueUserId() + const clone = vi.fn().mockResolvedValue(err({kind: 'clone-error', code: 'repo-exists'})) + const existingBinding = { + owner: 'testowner', + repo: 'testrepo', + channelId: 'ch-bound-456', + channelName: 'testrepo', + workspacePath: '/workspace/repos/testowner/testrepo', + createdAt: '2026-01-01T00:00:00.000Z', + createdByDiscordId: 'user-original', + } + // getBindingByRepo is called twice: once in PRE_FLIGHT (returns null), once in repo-exists handler (returns binding) + const getBindingByRepo = vi.fn().mockResolvedValueOnce(ok(null)).mockResolvedValue(ok(existingBinding)) + const {interaction, editReply} = makeInteraction({userId}) + const deps = makeDeps({ + workspaceClient: makeWorkspaceClient({clone}), + bindingsStore: makeBindingsStore({getBindingByRepo}), + }) + + // #when + await run(interaction, deps) + + // #then — redirects to existing channel with exact mention token; NEVER instructs deletion + const reply = lastEditReplyContent(editReply) + expect(reply).toContain('<#ch-bound-456>') + for (const content of allEditReplies(editReply)) { + expect(content).not.toContain('rm -rf') + } + }) + + it('repo-exists + NO binding → RESUMES: proceeds to channel creation and binding write', async () => { + // #given — clone fails with repo-exists; bindings store returns null (no binding yet) + // This is the post-clone partial-failure recovery path. + const userId = uniqueUserId() + const clone = vi.fn().mockResolvedValue(err({kind: 'clone-error', code: 'repo-exists'})) + // PRE_FLIGHT returns null; repo-exists handler also returns null (no binding written yet) + const getBindingByRepo = vi.fn().mockResolvedValue(ok(null)) + const createBinding = vi.fn().mockResolvedValue(ok({primaryEtag: 'e1', indexEtag: 'e2'})) + const {channel, send} = makeTextChannel('owner-repo') + const guild = makeGuild('bot-user-id', true, [], channel) + const {interaction, editReply} = makeInteraction({ + url: 'https://github.com/owner/repo', + guild, + userId, + }) + const deps = makeDeps({ + workspaceClient: makeWorkspaceClient({clone}), + bindingsStore: makeBindingsStore({getBindingByRepo, createBinding}), + }) + + // #when + await run(interaction, deps) + + // #then — channel was created (guild.channels.create was called) + const guildCreate = (guild.channels as unknown as {create: ReturnType}).create + expect(guildCreate).toHaveBeenCalled() + // #and — binding was written with the canonical workspace path + const bindingCall = (createBinding.mock.calls as [{workspacePath: string; owner: string; repo: string}][])[0] + expect(bindingCall?.[0]?.workspacePath).toBe('/workspace/repos/owner/repo') + expect(bindingCall?.[0]?.owner).toBe('owner') + expect(bindingCall?.[0]?.repo).toBe('repo') + // #and — READY reply reached + expect(lastEditReplyContent(editReply)).toContain('Ready') + // #and — welcome message sent + expect(send).toHaveBeenCalled() + // #and — clone was only called once (not re-run) + expect(clone).toHaveBeenCalledOnce() + // #and — no rm -rf in any reply + for (const content of allEditReplies(editReply)) { + expect(content).not.toContain('rm -rf') + } + }) + + it('repo-exists + binding store ERRORS (success=false) → internal-error reply, channel NOT created', async () => { + // #given — clone fails with repo-exists; second binding lookup returns an error result (store error) + // Store errors must NOT resume — resuming on a store outage risks orphan channels. + const userId = uniqueUserId() + const clone = vi.fn().mockResolvedValue(err({kind: 'clone-error', code: 'repo-exists'})) + const storeError = new Error('S3 timeout') + // PRE_FLIGHT returns null (ok); repo-exists handler returns err (store failure Result) + const getBindingByRepo = vi.fn().mockResolvedValueOnce(ok(null)).mockResolvedValue(err(storeError)) + const createBinding = vi.fn().mockResolvedValue(ok({primaryEtag: 'e1', indexEtag: 'e2'})) + const {channel} = makeTextChannel('owner-repo') + const guild = makeGuild('bot-user-id', true, [], channel) + const {interaction, editReply} = makeInteraction({ + url: 'https://github.com/owner/repo', + guild, + userId, + }) + const deps = makeDeps({ + workspaceClient: makeWorkspaceClient({clone}), + bindingsStore: makeBindingsStore({getBindingByRepo, createBinding}), + }) + + // #when + await run(interaction, deps) + + // #then — internal-error reply (not a resume) + expect(lastEditReplyContent(editReply)).toContain('Internal error checking existing bindings') + // #and — channel was NOT created (no resume on store error) + const guildCreate = (guild.channels as unknown as {create: ReturnType}).create + expect(guildCreate).not.toHaveBeenCalled() + // #and — createBinding was NOT called + expect(createBinding).not.toHaveBeenCalled() + // #and — no rm -rf in any reply + for (const content of allEditReplies(editReply)) { + expect(content).not.toContain('rm -rf') + } + }) + + it('repo-exists + getBindingByRepo REJECTS → internal-error reply, channel NOT created', async () => { + // #given — clone fails with repo-exists; getBindingByRepo throws (network-level rejection) + // A store rejection must NOT resume — we cannot confirm binding absence, so resuming risks orphan channels. + const userId = uniqueUserId() + const clone = vi.fn().mockResolvedValue(err({kind: 'clone-error', code: 'repo-exists'})) + // PRE_FLIGHT returns null; repo-exists handler rejects entirely + const getBindingByRepo = vi.fn().mockResolvedValueOnce(ok(null)).mockRejectedValue(new Error('connection reset')) + const createBinding = vi.fn().mockResolvedValue(ok({primaryEtag: 'e1', indexEtag: 'e2'})) + const {channel} = makeTextChannel('owner-repo') + const guild = makeGuild('bot-user-id', true, [], channel) + const {interaction, editReply} = makeInteraction({ + url: 'https://github.com/owner/repo', + guild, + userId, + }) + const deps = makeDeps({ + workspaceClient: makeWorkspaceClient({clone}), + bindingsStore: makeBindingsStore({getBindingByRepo, createBinding}), + }) + + // #when — must resolve without throwing even though getBindingByRepo rejects + await expect(run(interaction, deps)).resolves.toBeUndefined() + + // #then — internal-error reply (not a resume) + expect(lastEditReplyContent(editReply)).toContain('Internal error checking existing bindings') + // #and — channel was NOT created (no resume on store rejection) + const guildCreate = (guild.channels as unknown as {create: ReturnType}).create + expect(guildCreate).not.toHaveBeenCalled() + // #and — createBinding was NOT called + expect(createBinding).not.toHaveBeenCalled() + // #and — no rm -rf in any reply + for (const content of allEditReplies(editReply)) { + expect(content).not.toContain('rm -rf') + } + }) + + it('full retry-after-partial-failure: first invocation binding write fails, second invocation resumes successfully', async () => { + // #given — first invocation: clone succeeds, binding write fails + const userId1 = uniqueUserId() + const {channel: channel1, send: send1} = makeTextChannel('owner-repo') + const guild1 = makeGuild('bot-user-id', true, [], channel1) + const {interaction: interaction1, editReply: editReply1} = makeInteraction({ + url: 'https://github.com/owner/repo', + guild: guild1, + userId: userId1, + }) + + const cloneFresh = vi + .fn() + .mockResolvedValue(ok({ok: true, path: '/workspace/repos/owner/repo', commit: 'abc123'})) + const bindingStoreError = new Error('S3 write failed') + const createBindingFail = vi.fn().mockResolvedValue(err(bindingStoreError)) + const getBindingEmpty = vi.fn().mockResolvedValue(ok(null)) + + const deps1 = makeDeps({ + workspaceClient: makeWorkspaceClient({clone: cloneFresh}), + bindingsStore: makeBindingsStore({getBindingByRepo: getBindingEmpty, createBinding: createBindingFail}), + }) + + // #when — first invocation runs: clone succeeds but binding write fails + await run(interaction1, deps1) + + // #then — first invocation fails with a binding error message + expect(lastEditReplyContent(editReply1)).toContain('Failed to write binding') + + // #given — second invocation: workspace agent returns repo-exists (clone done), no binding in store + const userId2 = uniqueUserId() + const {channel: channel2, send: send2} = makeTextChannel('owner-repo') + const guild2 = makeGuild('bot-user-id', true, [], channel2) + const {interaction: interaction2, editReply: editReply2} = makeInteraction({ + url: 'https://github.com/owner/repo', + guild: guild2, + userId: userId2, + }) + + const cloneRepoExists = vi.fn().mockResolvedValue(err({kind: 'clone-error', code: 'repo-exists'})) + const createBindingOk = vi.fn().mockResolvedValue(ok({primaryEtag: 'e1', indexEtag: 'e2'})) + const getBindingStillEmpty = vi.fn().mockResolvedValue(ok(null)) + + const deps2 = makeDeps({ + workspaceClient: makeWorkspaceClient({clone: cloneRepoExists}), + bindingsStore: makeBindingsStore({getBindingByRepo: getBindingStillEmpty, createBinding: createBindingOk}), + }) + + // #when — second invocation runs: resumes from CREATING_CHANNEL + await run(interaction2, deps2) + + // #then — second invocation reaches READY + expect(lastEditReplyContent(editReply2)).toContain('Ready') + // #and — binding was written with canonical path + const calls = (createBindingOk.mock.calls as [{workspacePath: string}][])[0] + expect(calls?.[0]?.workspacePath).toBe('/workspace/repos/owner/repo') + // #and — welcome message sent in second invocation + expect(send2).toHaveBeenCalled() + // #and — clone NOT re-run in second invocation (repo-exists path, not fresh clone) + expect(cloneRepoExists).toHaveBeenCalledOnce() + // send1 was never reached (binding failed in first invocation) — intentionally unused + expect(send1).not.toHaveBeenCalled() + }) + }) + + // ------------------------------------------------------------------------- + // Concurrent resume race coverage (FIX 2) + // ------------------------------------------------------------------------- + + describe('CLONING: concurrent resume — BINDING_EXISTS_ERROR loser path', () => { + it('second concurrent resume hits BINDING_EXISTS_ERROR → bounded error reply, no throw, no re-clone', async () => { + // #given — both invocations see repo-exists + no binding (clone done, no binding yet) + // First createBinding succeeds; second gets BINDING_EXISTS_ERROR (atomic IfNoneMatch write) + const userId1 = uniqueUserId() + const userId2 = uniqueUserId() + + const clone = vi.fn().mockResolvedValue(err({kind: 'clone-error', code: 'repo-exists'})) + const getBindingByRepo = vi.fn().mockResolvedValue(ok(null)) + + const bindingExistsError = Object.assign(new Error('binding already exists'), {code: 'BINDING_EXISTS_ERROR'}) + let createBindingCallCount = 0 + const createBinding = vi.fn().mockImplementation(async () => { + createBindingCallCount++ + if (createBindingCallCount === 1) return ok({primaryEtag: 'e1', indexEtag: 'e2'}) + return err(bindingExistsError) + }) + + const {channel: channel1} = makeTextChannel('owner-repo') + const guild1 = makeGuild('bot-user-id', true, [], channel1) + const {interaction: interaction1, editReply: editReply1} = makeInteraction({ + url: 'https://github.com/owner/repo', + guild: guild1, + userId: userId1, + }) + + const {channel: channel2} = makeTextChannel('owner-repo') + const guild2 = makeGuild('bot-user-id', true, [], channel2) + const {interaction: interaction2, editReply: editReply2} = makeInteraction({ + url: 'https://github.com/owner/repo', + guild: guild2, + userId: userId2, + }) + + const deps1 = makeDeps({ + workspaceClient: makeWorkspaceClient({clone}), + bindingsStore: makeBindingsStore({getBindingByRepo, createBinding}), + }) + const deps2 = makeDeps({ + workspaceClient: makeWorkspaceClient({clone}), + bindingsStore: makeBindingsStore({getBindingByRepo, createBinding}), + }) + + // #when — first invocation wins the binding race + await expect(run(interaction1, deps1)).resolves.toBeUndefined() + + // #when — second invocation loses: createBinding returns BINDING_EXISTS_ERROR + await expect(run(interaction2, deps2)).resolves.toBeUndefined() + + // #then — winner reaches READY + expect(lastEditReplyContent(editReply1)).toContain('Ready') + + // #then — loser gets the concurrent-bound message (not a throw, not a re-clone) + const loserReply = lastEditReplyContent(editReply2) + expect(loserReply).toContain('bound by a concurrent request') + expect(loserReply).toContain('manual cleanup may be needed') + + // #and — clone was called twice (two independent resume invocations), not re-run internally + expect(clone).toHaveBeenCalledTimes(2) + + // #and — createBinding was called twice (both invocations attempted the write) + expect(createBinding).toHaveBeenCalledTimes(2) + }) + }) + + describe('shutdown gate (Part 2)', () => { + afterEach(() => { + __resetShuttingDownForTests() + }) + + it('isShuttingDown returns true → replies "fro-bot is restarting" and does NOT call clone', async () => { + // #given — bot is draining shutdown + const userId = uniqueUserId() + const clone = vi.fn() + const {interaction, editReply} = makeInteraction({ + url: 'https://github.com/owner/repo', + userId, + }) + const deps = makeDeps({ + workspaceClient: makeWorkspaceClient({clone}), + isShuttingDown: () => true, + }) + + // #when + await run(interaction, deps) + + // #then — user gets restart message + expect(lastEditReplyContent(editReply)).toContain('fro-bot is restarting') + // #and — clone was never called (no new work started) + expect(clone).not.toHaveBeenCalled() + }) + + it('isShuttingDown absent (default) → proceeds normally', async () => { + // #given — no isShuttingDown dep injected (optional field left absent) + const userId = uniqueUserId() + const clone = vi.fn().mockResolvedValue(ok({ok: true, path: '/workspace/repos/owner/repo', commit: 'abc123'})) + const {interaction} = makeInteraction({ + url: 'https://github.com/owner/repo', + userId, + }) + // makeDeps without isShuttingDown — the dep is optional; absence defaults to () => false + const deps = makeDeps({workspaceClient: makeWorkspaceClient({clone})}) + + // #when + await run(interaction, deps) + + // #then — clone was called (command proceeded normally) + expect(clone).toHaveBeenCalled() + }) + }) + describe('canonicalization', () => { it('owner/Repo and owner/repo produce the same binding key', async () => { // #given diff --git a/packages/gateway/src/discord/commands/add-project.ts b/packages/gateway/src/discord/commands/add-project.ts index d2ca42cb..e608cc42 100644 --- a/packages/gateway/src/discord/commands/add-project.ts +++ b/packages/gateway/src/discord/commands/add-project.ts @@ -10,7 +10,7 @@ * - Owner/repo are canonicalized to lowercase before any lookup or write. */ -import type {ChatInputCommandInteraction, PermissionsBitField} from 'discord.js' +import type {ChatInputCommandInteraction, Guild, PermissionsBitField} from 'discord.js' import type {BindingsStore} from '../../bindings/store.js' import type {AppClient} from '../../github/app-client.js' import type {WorkspaceClient} from '../../workspace-api/client.js' @@ -19,6 +19,7 @@ import {PermissionFlagsBits} from 'discord.js' import {Effect} from 'effect' import {AppNotInstalledError} from '../../github/app-client.js' +import {workspaceRepoPath} from '../../workspace-api/client.js' import {createChannelWithCollisionSuffix} from '../channels.js' // --------------------------------------------------------------------------- @@ -37,6 +38,12 @@ export interface AddProjectDeps { readonly warn: (msg: string, meta?: Record) => void readonly error: (msg: string, meta?: Record) => void } + /** + * Optional. Defaults to `() => false` when absent so callers that don't inject it + * are unaffected. When present, returning `true` causes the command to bail early + * with a user-friendly restart message instead of starting work that will be hard-killed. + */ + readonly isShuttingDown?: () => boolean } // --------------------------------------------------------------------------- @@ -143,6 +150,26 @@ function botHasRequiredPermissions(appPermissions: PermissionsBitField | null): return appPermissions.has(PermissionFlagsBits.ManageChannels) && appPermissions.has(PermissionFlagsBits.SendMessages) } +// Invoking-user authorization gate: prevents privilege amplification where a member coerces +// the bot's broader ManageChannels permission to create channels they could not create themselves. +// Uses guild-level base permissions (no channel overwrites) to prevent a user with a +// channel-scoped ManageChannels overwrite from bypassing the gate. +async function userIsAuthorized(guild: Guild, userId: string, logger: AddProjectDeps['logger']): Promise { + try { + // guild.members.fetch() is a REST call — works without the privileged GuildMembers intent. + // Do NOT use guild.members.cache.get() — returns undefined without the intent. + const member = await guild.members.fetch(userId) + // member.permissions is the guild-level base permission set (no channel overwrites). + return member.permissions.has(PermissionFlagsBits.ManageChannels) + } catch (error) { + // Fail closed: if we cannot resolve the member's guild permissions, deny. + logger.warn('add-project: member permission resolution failed', { + error: error instanceof Error ? error.message : String(error), + }) + return false + } +} + // --------------------------------------------------------------------------- // Interaction window guard (14-minute limit) // --------------------------------------------------------------------------- @@ -197,6 +224,16 @@ async function runAddProject(interaction: ChatInputCommandInteraction, deps: Add // Defer reply (ephemeral — setup thread is operationally sensitive) await interaction.deferReply({ephemeral: true}) + // Shutdown gate — placed after deferReply so Discord gets its mandatory ack (<3s). + // Refuse new work during draining shutdown; resume (Part 1) heals any hard-killed run. + const shuttingDownCheck = deps.isShuttingDown ?? (() => false) + if (shuttingDownCheck() === true) { + await interaction.editReply({ + content: 'fro-bot is restarting. Please try `/fro-bot add-project` again in a moment.', + }) + return + } + const guild = interaction.guild if (guild === null) { await interaction.editReply({content: 'This command can only be used in a server.'}) @@ -204,7 +241,7 @@ async function runAddProject(interaction: ChatInputCommandInteraction, deps: Add } // Check bot permissions - if (!botHasRequiredPermissions(interaction.appPermissions)) { + if (botHasRequiredPermissions(interaction.appPermissions) === false) { logger.warn('add-project: missing bot permissions', {correlationId, phase}) await interaction.editReply({ content: `fro-bot needs **Manage Channels** and **Send Messages** permissions. Re-invite the bot at: ${installUrl}`, @@ -212,6 +249,17 @@ async function runAddProject(interaction: ChatInputCommandInteraction, deps: Add return } + // Runtime authorization check — invoking user must hold ManageChannels. + // setDefaultMemberPermissions is NOT used — it would gate the entire /fro-bot parent + // command (including /ping). This is a scoped runtime check per subcommand. + if ((await userIsAuthorized(guild, interaction.user.id, logger)) === false) { + logger.warn('add-project: unauthorized user', {correlationId, phase}) + await interaction.editReply({ + content: 'You need the **Manage Channels** permission to use this command.', + }) + return + } + // Parse and validate URL const rawUrl = interaction.options.getString('url', true) const parsed = parseGitHubUrl(rawUrl) @@ -308,6 +356,12 @@ async function runAddProject(interaction: ChatInputCommandInteraction, deps: Add } const cloneResult = await workspaceClient.clone({owner, repo, token}) + // Intentionally uninitialized: TypeScript's definite-assignment analysis enforces that + // every path below either assigns workspacePath (fresh clone success, or repo-exists resume) + // or returns. A sentinel default would defeat that compile-time guard — if a future edit + // drops a return in an error branch, tsc errors here instead of passing an empty path to + // channel creation. Do not initialize. + let workspacePath: string if (cloneResult.success === false) { const errorKind = cloneResult.error.kind logger.warn('add-project: clone failed', {correlationId, phase, owner, repo, errorKind, outcome: 'error'}) @@ -319,39 +373,70 @@ async function runAddProject(interaction: ChatInputCommandInteraction, deps: Add content: 'The workspace volume is out of space. Free disk by removing unused repos under `/workspace/repos` and retry.', }) + return } else if (code === 'repo-exists') { - await interaction.editReply({ - content: `The repo \`${owner}/${repo}\` already exists in the workspace. Remove it with \`rm -rf /workspace/repos/${owner}/${repo}\` and retry.`, - }) + // repo-exists means the workspace-agent completed the clone atomically (temp dir renamed + // to destPath). Decide: redirect (already bound), resume (clone exists, no binding), or + // error (store unavailable — do NOT resume; orphan risk). + // + // Never emit deletion instructions — we must never instruct the user to rm -rf. + let existing: Awaited> + try { + existing = await bindingsStore.getBindingByRepo(owner, repo) + } catch { + await interaction.editReply({content: 'Internal error checking existing bindings. Please retry in a moment.'}) + return + } + if (existing.success === false) { + await interaction.editReply({content: 'Internal error checking existing bindings. Please retry in a moment.'}) + return + } + if (existing.data !== null) { + // Genuinely already bound — redirect to the bound channel. Nothing to resume. + await interaction.editReply({ + content: `\`${owner}/${repo}\` is already set up in <#${existing.data.channelId}>.`, + }) + return + } + // Clone exists but no binding — a prior run failed after CLONING. Resume from CREATING_CHANNEL. + // Concurrency: a racing invocation also resuming will have its createBinding rejected with + // BINDING_EXISTS_ERROR (atomic IfNoneMatch write), handled below (~line ~500). The losing run + // may leave an orphan channel — accepted v1 fallout (see docs/solutions orchestration-patterns). + workspacePath = workspaceRepoPath(owner, repo) + logger.info('add-project phase', {phase: 'CLONING', outcome: 'resumed', owner, repo, correlationId}) + // fall through — do NOT return } else { await interaction.editReply({ content: `Clone failed (${code}). Check workspace-agent logs for details.`, }) + return } } else if (errorKind === 'timeout') { await interaction.editReply({content: 'Clone timed out (5 minutes). The repo may be very large. Retry.'}) + return } else if (errorKind === 'response-mismatch') { logger.error('add-project: response-mismatch from workspace agent', {correlationId, phase, owner, repo}) await interaction.editReply({ content: 'Internal error: workspace agent returned unexpected response. Contact operator.', }) + return } else { await interaction.editReply({content: `Clone failed (${errorKind}). Check workspace-agent connectivity.`}) + return } - return + } else { + workspacePath = cloneResult.data.path + logger.info('add-project phase', { + correlationId, + phase, + owner, + repo, + workspacePath, + outcome: 'success', + durationMs: Date.now() - startTime, + }) } - const workspacePath = cloneResult.data.path - logger.info('add-project phase', { - correlationId, - phase, - owner, - repo, - workspacePath, - outcome: 'success', - durationMs: Date.now() - startTime, - }) - // --------------------------------------------------------------------------- // CREATING_CHANNEL // --------------------------------------------------------------------------- diff --git a/packages/gateway/src/program.ts b/packages/gateway/src/program.ts index 57043fb4..698842a9 100644 --- a/packages/gateway/src/program.ts +++ b/packages/gateway/src/program.ts @@ -10,7 +10,7 @@ import {createDiscordClient} from './discord/client.js' import {dispatchCommand, getCommandRegistry, registerSlashCommands} from './discord/commands/index.js' import {handleMention} from './discord/mentions.js' import {createAppClient} from './github/app-client.js' -import {installShutdownHandlers} from './shutdown.js' +import {installShutdownHandlers, isShuttingDown} from './shutdown.js' import {createWorkspaceClient} from './workspace-api/client.js' // --------------------------------------------------------------------------- @@ -120,6 +120,7 @@ export function makeGatewayProgram(deps: GatewayProgramDeps, config: GatewayConf workspaceClient, installUrl: config.gatewayGitHubAppInstallUrl, logger: addProjectLogger, + isShuttingDown, } const registry = getCommandRegistry(commandDeps) diff --git a/packages/gateway/src/shutdown.test.ts b/packages/gateway/src/shutdown.test.ts index d12a744a..9797f4a0 100644 --- a/packages/gateway/src/shutdown.test.ts +++ b/packages/gateway/src/shutdown.test.ts @@ -5,7 +5,7 @@ import process from 'node:process' import {afterEach, beforeEach, describe, expect, it, vi} from 'vitest' -import {__resetShuttingDownForTests, DEFAULT_DRAIN_MS, installShutdownHandlers} from './shutdown.js' +import {__resetShuttingDownForTests, DEFAULT_DRAIN_MS, installShutdownHandlers, isShuttingDown} from './shutdown.js' // --------------------------------------------------------------------------- // Helpers @@ -208,3 +208,39 @@ describe('installShutdownHandlers', () => { expect(destroy1Count + destroy2Count).toBe(1) }) }) + +describe('isShuttingDown', () => { + beforeEach(() => { + vi.useFakeTimers() + __resetShuttingDownForTests() + }) + + afterEach(() => { + vi.useRealTimers() + __resetShuttingDownForTests() + }) + + it('returns false before any signal is received', () => { + // #given — no signal fired, module freshly reset + // #when / #then + expect(isShuttingDown()).toBe(false) + }) + + it('returns true after SIGTERM handler fires', async () => { + // #given — install handlers with a slow destroy so we can observe state mid-shutdown + const {logger} = makeLogger() + const client = makeClient(100_000) // hangs + const exitSpy = vi.spyOn(process, 'exit').mockImplementation(() => undefined as never) + const cleanup = installShutdownHandlers(client, logger, 200_000) + + // #when — emit SIGTERM + process.emit('SIGTERM') + await vi.advanceTimersByTimeAsync(1) // let the handler synchronously set shuttingDown + + // #then + expect(isShuttingDown()).toBe(true) + + cleanup() + exitSpy.mockRestore() + }) +}) diff --git a/packages/gateway/src/shutdown.ts b/packages/gateway/src/shutdown.ts index 51d0029f..5f709b78 100644 --- a/packages/gateway/src/shutdown.ts +++ b/packages/gateway/src/shutdown.ts @@ -32,6 +32,17 @@ export function __resetShuttingDownForTests(): void { shuttingDown = false } +/** + * Returns true if a shutdown signal has been received and the gateway is draining. + * Intended for command handlers to check before starting long-running work. + * + * This is a pure getter — production code may call it; only `__resetShuttingDownForTests` + * is test-only. + */ +export function isShuttingDown(): boolean { + return shuttingDown +} + /** * Install SIGTERM and SIGINT handlers that gracefully drain the Discord client. * diff --git a/packages/gateway/src/workspace-api/client.test.ts b/packages/gateway/src/workspace-api/client.test.ts index 46baf7ed..f86d5d8c 100644 --- a/packages/gateway/src/workspace-api/client.test.ts +++ b/packages/gateway/src/workspace-api/client.test.ts @@ -113,10 +113,10 @@ describe('createWorkspaceClient', () => { vi.unstubAllGlobals() }) - it('accepts path match case-insensitively', async () => { - // #given + it('accepts lowercase owner/repo matching the response path', async () => { + // #given — owner/repo arrive already lowercased from the caller (add-project.ts canonicalizes) const client = makeClient() - const req = makeRequest({owner: 'TestOwner', repo: 'TestRepo'}) + const req = makeRequest({owner: 'testowner', repo: 'testrepo'}) const fetchMock = mockFetch({ ok: true, json: async () => ({ok: true, path: '/workspace/repos/testowner/testrepo', commit: 'abc123'}), @@ -130,6 +130,79 @@ describe('createWorkspaceClient', () => { expect(result.success).toBe(true) vi.unstubAllGlobals() }) + + // Strict prefix validation — suffix-only check accepted adversarial paths. + it('rejects adversarial path /etc/passwd/testowner/testrepo', async () => { + // #given — suffix matches but path root is wrong (privilege escalation attempt) + const client = makeClient() + const req = makeRequest() + const fetchMock = mockFetch({ + ok: true, + json: async () => ({ok: true, path: '/etc/passwd/testowner/testrepo', commit: 'abc123'}), + }) + vi.stubGlobal('fetch', fetchMock) + + // #when + const result = await client.clone(req) + + // #then — must reject: path doesn't start with /workspace/repos + expect(result).toEqual(err({kind: 'response-mismatch'})) + vi.unstubAllGlobals() + }) + + it('accepts exactly /workspace/repos/{owner}/{repo}', async () => { + // #given — exact expected path + const client = makeClient() + const req = makeRequest() + const fetchMock = mockFetch({ + ok: true, + json: async () => ({ok: true, path: '/workspace/repos/testowner/testrepo', commit: 'abc123'}), + }) + vi.stubGlobal('fetch', fetchMock) + + // #when + const result = await client.clone(req) + + // #then — must succeed + expect(result.success).toBe(true) + vi.unstubAllGlobals() + }) + + it('rejects path with extra trailing segment /workspace/repos/testowner/testrepo/extra', async () => { + // #given — suffix matches but there's an extra path segment after owner/repo + const client = makeClient() + const req = makeRequest() + const fetchMock = mockFetch({ + ok: true, + json: async () => ({ok: true, path: '/workspace/repos/testowner/testrepo/extra', commit: 'abc123'}), + }) + vi.stubGlobal('fetch', fetchMock) + + // #when + const result = await client.clone(req) + + // #then — exact equality rejects the extra segment + expect(result).toEqual(err({kind: 'response-mismatch'})) + vi.unstubAllGlobals() + }) + + it('rejects path with uppercase workspace root /WORKSPACE/repos/owner/repo', async () => { + // #given — case-variant root; lowercasing response path would bypass validation + const client = makeClient() + const req = makeRequest() + const fetchMock = mockFetch({ + ok: true, + json: async () => ({ok: true, path: '/WORKSPACE/repos/testowner/testrepo', commit: 'abc123'}), + }) + vi.stubGlobal('fetch', fetchMock) + + // #when + const result = await client.clone(req) + + // #then — exact comparison rejects the case-variant root + expect(result).toEqual(err({kind: 'response-mismatch'})) + vi.unstubAllGlobals() + }) }) describe('clone-error codes', () => { diff --git a/packages/gateway/src/workspace-api/client.ts b/packages/gateway/src/workspace-api/client.ts index 69d202ef..8011cf02 100644 --- a/packages/gateway/src/workspace-api/client.ts +++ b/packages/gateway/src/workspace-api/client.ts @@ -24,6 +24,21 @@ export interface WorkspaceClient { const DEFAULT_TIMEOUT_MS = 300_000 // 5 minutes +// Mirrors WORKSPACE_REPOS_ROOT in apps/workspace-agent/src/clone.ts (separate +// package/container boundary, so not imported). Module-private: callers use +// workspaceRepoPath() rather than composing the root themselves. +const EXPECTED_WORKSPACE_ROOT = '/workspace/repos' + +/** + * Returns the canonical workspace path for a given owner/repo pair. + * Single source of truth shared by the client validator and add-project resume logic. + * + * owner and repo MUST already be lowercased (canonical form) before calling this. + */ +export function workspaceRepoPath(owner: string, repo: string): string { + return `${EXPECTED_WORKSPACE_ROOT}/${owner}/${repo}` +} + /** * Create a workspace-agent HTTP client. * @@ -86,9 +101,12 @@ export function createWorkspaceClient(options: WorkspaceClientOptions): Workspac return err({kind: 'clone-error', code: parsed.error}) } - // Response integrity check: path must end with /{owner}/{repo} (case-insensitive). - const expectedSuffix = `/${owner.toLowerCase()}/${repo.toLowerCase()}` - if (!parsed.path.toLowerCase().endsWith(expectedSuffix)) { + // Strict full-path equality check (root + owner + repo). + // The prior suffix-only check accepted adversarial paths like /etc/passwd/owner/repo. + // owner/repo arrive already lowercased from add-project.ts; lowercasing the response + // path would let a case-variant root bypass validation. + const expectedPath = workspaceRepoPath(owner, repo) + if (parsed.path !== expectedPath) { return err({kind: 'response-mismatch'}) }