PostHog · edwinyjlim · May 9, 2026 · May 10, 2026 · May 10, 2026 · May 10, 2026
diff --git a/transformation-config/skills/events-audit/config.yaml b/transformation-config/skills/events-audit/config.yaml
@@ -0,0 +1,14 @@
+type: docs-only
+template: description.md
+description: Audit PostHog events in a codebase — produce an inventory of every captured event mapped to its file, area, and 30-day volume for the product team to query
+tags: [analytics, audit, best-practices]
+references:
+  preamble: "**Read ONLY this file.** Do not read any other reference file until this one tells you to."
+shared_docs:
+  - https://posthog.com/docs/product-analytics/best-practices.md
+  - https://posthog.com/docs/getting-started/identify-users.md
+variants:
+  - id: all
+    display_name: PostHog events audit
+    tags: [analytics, audit, best-practices]
+    docs_urls: []
diff --git a/transformation-config/skills/events-audit/description.md b/transformation-config/skills/events-audit/description.md
@@ -0,0 +1,70 @@
+# PostHog events audit
+
+This skill produces a product-browseable report of every PostHog event your code captures, mapped to the codebase area, and enriched with 30-day volume from PostHog.
+
+## Workflow
+
+The audit runs as a 6-step chain:
+
+1. Detect SDK
+2. Scan capture sites (grep only)
+3. Enrich (subagent fan-out — the only step that reads source files)
+4. Query PostHog for volume
+5. Write report
+6. Create dashboard
+
+Each step file points to the next. Run them in order. Don't explore the source tree on your own.
+
+**Start by reading `references/1-detect.md`** (relative to this skill's directory – typically `.claude/skills/events-audit/references/1-detect.md`). Don't read ahead. Don't re-read a step once you've passed it. Don't re-read SKILL.md.
+
+Step 1 seeds the audit checklist as its first action. Don't assume the runtime pre-seeds it.
+
+## The audit checklist
+
+The audit checklist has three shared checks in addition to the event map audit: `identity-segmentation`, `coverage-map`, `data-quality`. Finish each one. Don't invent new ids.
+
+The checklist lives at `.posthog-audit-checks.json`. It's owned by MCP tools – **never `Write` it directly**.
+
+Check entry shape:
+
+- `id` - stable kebab-case slug. The three shared ids are `identity-segmentation`, `coverage-map`, `data-quality`.
+- `area` - short group name. Shared entries use `Identity`, `Coverage`, `Data quality`.
+- `label` - short human name.
+- `status` - `pending` | `pass` | `error` | `warning` | `suggestion`.
+- `file` - optional `path:line` for findings tied to a location.
+- `details` - Markdown bulleted summary in plain language. Describe state and the product questions blocked. Don't render `status` as a grade in the report; the enum is for filter logic only.
+
+## The events inventory
+
+A second file, `.posthog-events-inventory.json`, is the working event inventory for steps 2 through 4. It holds the capture sites with derived `package`/`area`/`route`/`enclosing` fields, event names, properties, and per-event volume from PostHog.
+
+It's **not** MCP-owned – no `audit_*` tool guards it. The inventory is **transient scratch state**, not a deliverable: step 5 deletes `.posthog-audit-checks.json` once the report is written, and step 6 deletes the inventory after the optional dashboard step. The report is the only artifact the user keeps.
+
+## Key principles
+
+- **Show your evidence.** Cite `file:line` for every non-pass finding.
+- **Frame findings as product questions.** Every finding describes *what product question or insight it blocks*, not what code rule it breaks.
+- **Hand the reader the map. Don't tell the story for them.** The deliverable is a single report with three short qualitative checks plus a few suggested follow-ups. The reader clusters events into flows on demand by asking targeted follow-up questions about the report — the skill doesn't do that synthesis upfront.
+
+## Live activity – `[STATUS]`
+
+The "Working on …" banner reads from `[STATUS]` lines you emit in plain text. Whenever you start a sub-step, write a line like:
+
+```
+[STATUS] Scanning capture sites
+```
+
+The wizard catches these and updates the spinner. Use them freely – they're cheap. Each step file lists the exact strings to emit. Don't invent your own.
+
+## Abort statuses
+
+Report aborts with `[ABORT]`-prefixed messages. The wizard catches these and stops the run – don't halt yourself.
+
+- `[ABORT] No PostHog SDK found`
+- `[ABORT] No capture call sites found in any detected SDK`
+
+MCP failures (project mismatch, query errors, no connection) are **not** abort conditions — step 4 soft-degrades and step 5 renders the report with a `{{mcp_disclaimer}}` callout in place of volume sections. See step 4 for the degradation contract.
+
+## Framework guidelines
+
+{commandments}
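A note on the `[STATUS]` / `[ABORT]` protocol described in description.md: the wizard-side parsing isn't part of this diff, so the following is only a minimal sketch of how such marker lines could be picked out of plain-text agent output. `parse_signals`, `Signal`, and the regex are hypothetical names for illustration, and the assumption that a marker must start its line is not something the PR states.

```python
import re
from dataclasses import dataclass

# Assumed grammar: "[STATUS] <text>" or "[ABORT] <text>" at the start of a line.
MARKER = re.compile(r"^\[(STATUS|ABORT)\]\s+(.+?)\s*$")


@dataclass
class Signal:
    kind: str      # "STATUS" or "ABORT"
    message: str   # text after the marker


def parse_signals(output: str) -> list[Signal]:
    """Collect marker lines in order, stopping after the first ABORT."""
    signals: list[Signal] = []
    for line in output.splitlines():
        match = MARKER.match(line.strip())
        if match is None:
            continue  # ordinary output, not a marker line
        signals.append(Signal(match.group(1), match.group(2)))
        if match.group(1) == "ABORT":
            break  # nothing after an abort should update the banner
    return signals
```

Stopping at the first `[ABORT]` mirrors the rule that the wizard, not the agent, ends the run.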
diff --git a/transformation-config/skills/events-audit/references/1-detect.md b/transformation-config/skills/events-audit/references/1-detect.md
@@ -0,0 +1,80 @@
+---
+next_step: 2-scan.md
+---
+
+# Step 1 – Detect SDKs
+
+Seed the audit checklist, then find every PostHog SDK in the project and remember which language(s) and framework(s) the rest of the audit will work on. **Read-only on the codebase.** Don't scan code for capture sites – that's step 2.
+
+## Tools
+
+Load via `ToolSearch select:Read,Glob,mcp__wizard-tools__audit_seed_checks,mcp__wizard-tools__audit_resolve_checks` once at the start of this step.
+
+## Status
+
+Emit, in order:
+
+```
+[STATUS] Seeding audit checklist
+[STATUS] Detecting SDKs
+```
+
+## Action
+
+### a. Seed the audit checklist
+
+The checklist lives at `.posthog-audit-checks.json` and renders live in the "Audit plan" tab. **Don't rely on the runtime pre-seeding it** — call `mcp__wizard-tools__audit_seed_checks` directly here. The tool replaces the file atomically, so calling it once at the start of every run is safe.
+
+Pass exactly these three shared checks (`identity-segmentation`, `coverage-map`, `data-quality`):
+
+```json
+{
+  "checks": [
+    {
+      "id": "identity-segmentation",
+      "area": "Identity",
+      "label": "Identity & segmentation",
+      "status": "pending"
+    },
+    {
+      "id": "coverage-map",
+      "area": "Coverage",
+      "label": "Coverage map",
+      "status": "pending"
+    },
+    {
+      "id": "data-quality",
+      "area": "Data quality",
+      "label": "Data quality",
+      "status": "pending"
+    }
+  ]
+}
+```
+
+Don't invent new ids — later steps resolve checks by these exact ids. Don't `Write` the file directly; the MCP tool owns it.
+
+### b. Find PostHog SDKs
+
+`Glob` for the project's dependency manifests across every language PostHog ships an SDK for. The full list:
+
+- `package.json` - npm / pnpm / yarn (Node, web, React, Next.js, Nuxt, Vue, Svelte, Angular, React Native, Expo)
+- `requirements.txt`, `pyproject.toml`, `Pipfile`, `setup.py` - Python (Django, Flask, FastAPI)
+- `Gemfile` - Ruby / Rails
+- `composer.json` - PHP / Laravel
+- `go.mod` - Go
+- `build.gradle`, `build.gradle.kts`, `pom.xml` - Java / Android
+- `Podfile`, `Package.swift` - iOS / Swift
+- `pubspec.yaml` - Flutter / Dart
+- `*.csproj` - .NET
+- `mix.exs` - Elixir
+
+Read enough of them to identify which PostHog SDK the project uses, what version, and what framework it sits on top of.
+
+If the project is a monorepo, you may find multiple PostHog SDKs.
+
+For each dependency manifest, extract every dependency whose name contains `posthog`, case-insensitively (e.g. `posthog`, `posthog-node`, `posthog-js`, `posthog-python`, `posthog-ruby`, `github.com/posthog/posthog-go`).
+
+Hold `{ sdk, version, manifest, framework }` per SDK in memory. The next step uses this list.
+
+If no PostHog SDK is anywhere in the project, emit `[ABORT] No PostHog SDK found` and stop. The wizard catches `[ABORT]` and terminates the run.
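For the dependency-extraction rule in 1-detect.md, here is a rough sketch of what the `package.json` case could look like. It is an illustration only: `posthog_deps_from_package_json` is a hypothetical helper, the three dependency sections it reads are an assumption, and it matches any name containing `posthog` (a slightly broader test than "starts with", so scoped packages are caught too). The `framework` field is left unset because inferring it depends on the other dependencies in the manifest.

```python
import json


def posthog_deps_from_package_json(text: str, manifest: str = "package.json") -> list[dict]:
    """Return one {sdk, version, manifest, framework} record per PostHog dependency."""
    data = json.loads(text)
    found = []
    # Assumed set of sections worth checking; a real project may use others.
    for section in ("dependencies", "devDependencies", "peerDependencies"):
        for name, version in (data.get(section) or {}).items():
            if "posthog" in name.lower():
                found.append({
                    "sdk": name,
                    "version": version,
                    "manifest": manifest,
                    "framework": None,  # not inferred in this sketch
                })
    return found
```

An empty result across every manifest is the condition that leads to `[ABORT] No PostHog SDK found`.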
diff --git a/transformation-config/skills/events-audit/references/2-scan.md b/transformation-config/skills/events-audit/references/2-scan.md
@@ -0,0 +1,127 @@
+---
+next_step: 3-enrich.md
+---
+
+# Step 2 – Scan capture sites
+
+Find every PostHog capture/identify/group SDK call in the codebase via a single `Grep` and write a base inventory. **Read-only via Grep.** Don't `Read` any source files in this step — file-level enrichment happens in step 3.
+
+This step is one Grep, one Write. No file Reads, no subagents, no MCP. Severity, flows, and identity analysis come later.
+
+## Tools
+
+Load via `ToolSearch select:Grep,Write` once at the start of this step.
+
+## Status
+
+Emit, in order:
+
+```
+[STATUS] Scanning SDK capture sites
+[STATUS] Writing base event inventory
+```
+
+## Action
+
+### a. Grep for direct SDK calls (with context)
+
+Run a single `Grep` for the standard PostHog call shapes. Use `-A 3` so multi-line capture calls are visible without opening the file. Narrow `--include` to the languages step 1 detected — don't scan `*.kt` if the project is Python.
+
+```
+Grep -rn -B 0 -A 3 -E 'posthog\??\.(capture|identify|alias|group|setPersonProperties|setPersonPropertiesForFlags|reset)|usePostHog\(\)\??\.(capture|identify)|client\??\.(capture|Capture)|posthog\.Capture\{|PostHog\??\.(shared|capture)|PostHog::capture|Posthog(\(\))?\??\.capture'
+```
+
+The `\??\.` matches both `posthog.capture(...)` and `posthog?.capture(...)` (optional chaining). JS/TS codebases routinely guard SDK calls with `?.` when the SDK may be uninitialised, so missing this pattern can substantially undercount the inventory.
+
+Common include patterns:
+
+- Python: `--include='*.py'`
+- JS/TS web: `--include='*.ts' --include='*.tsx' --include='*.js' --include='*.jsx' --include='*.vue' --include='*.svelte' --include='*.html'`
+- Ruby: `--include='*.rb'`
+- Go: `--include='*.go'`
+- Java/Kotlin/Android: `--include='*.java' --include='*.kt'`
+- iOS/Swift: `--include='*.swift'`
+- Flutter: `--include='*.dart'`
+- C#/.NET: `--include='*.cs'`
+- Elixir: `--include='*.ex' --include='*.exs'`
+
+**Exclude test files.** Drop hits in paths matching `*.test.*`, `*.spec.*`, `__tests__/**`, `tests/**`, `spec/**`. They pollute the inventory.
+
+#### Per-SDK call signatures (covered by the regex above)
+
+Canonical reference for what a PostHog capture call looks like in each SDK. The grep regex above is a union of these shapes; step 3 subagents also use this table to find `event_name` and `properties` slots when extracting (they `Read` this file once at start).
+
+| SDK | Capture pattern | Event-name position | Properties position |
+|-----|-----------------|---------------------|---------------------|
+| posthog-js | `posthog.capture("event", { props })` | positional 1 | positional 2 (object literal) |
+| posthog-js (hook) | `usePostHog().capture("event", { props })` | positional 1 | positional 2 |
+| posthog-node | `client.capture({ distinctId, event, properties })` | object key `event` | object key `properties` |
+| posthog-python | `posthog.capture(distinct_id, "event", properties)` | positional 2 | positional 3 (dict) |
+| posthog-ruby | `posthog.capture({ distinct_id:, event:, properties: })` | hash key `event` | hash key `properties` |
+| posthog-go | `client.Enqueue(posthog.Capture{Event: "...", Properties: posthog.NewProperties()...})` | struct field `Event` | struct field `Properties` |
+| posthog-ios | `PostHog.shared.capture("event", properties: ["k": "v"])` | positional 1 | named `properties` |
+| posthog-android | `PostHog.capture("event", properties = mapOf("k" to "v"))` | positional 1 | named `properties` |
+| posthog-react-native | Same shape as posthog-js | positional 1 | positional 2 |
+| posthog-flutter | `Posthog().capture(eventName: "...", properties: { ... })` | named `eventName` | named `properties` |
+| posthog-php | `PostHog::capture(['distinctId' => ..., 'event' => '...', 'properties' => [...]])` | array key `event` | array key `properties` |
+| posthog-dotnet | `client.Capture(distinctId, "event", new() { ["k"] = "v" })` | positional 2 | positional 3 |
+| posthog-elixir | `Posthog.capture("event", distinct_id, %{ k: v })` | positional 1 | positional 3 |
+
+If the result is empty:
+
+- And the project's manifest had a PostHog SDK in step 1 → the codebase likely wraps the SDK behind a custom helper. Write `{ "rows": [], "wrapper_undetected": true }` to `.posthog-events-inventory.json` and skip the rest of this step (move on to step 3, which will short-circuit on empty rows). The data-quality check in the report step will flag this.
+- And no SDK was in the manifest either → emit `[ABORT] No capture call sites found in any detected SDK`.
+
+### b. Parse grep output into row groups
+
+`Grep -A 3` emits one trigger line plus up to three following lines per match. Non-adjacent groups are separated by `--` divider lines; adjacent or overlapping matches run together without a divider. For each match:
+
+- The trigger line is `path:line:content` – the `.capture(` / `.identify(` / etc. site.
+- The following 0–3 lines are continuations from the same file, printed as `path-line-content` (`-` separators instead of `:`).
+- Group them as a "slice" – the trigger line plus its trailing context lines.
+
+The slice is what you reason about in step (c). You don't need to re-grep or open the file.
+
+### c. Build base rows
+
+For each grouped slice, build one row:
+
+```jsonc
+{
+  "id": "capture-<short-file-slug>-<line>",
+  "file": "src/checkout/Checkout.tsx",
+  "line": 88,
+  "raw_match": "<the trigger line + up to 3 continuation lines, joined by \\n>",
+  "event_name": "purchase_completed",
+  "is_dynamic": false
+}
+```
+
+`event_name` resolution rule: take the **first quoted string literal** (single-, double-, or backtick-quoted) in the slice, provided it sits in the event-name slot. If the first non-whitespace argument inside the parentheses is a quoted literal, take it. Otherwise:
+
+- The slice contains a quoted literal but it's clearly a property value (e.g. `{ revenue: "USD" }`) and not the event name → keep scanning forward to find the event-name slot, or fall through to dynamic.
+- The slice contains no quoted literal at all → set `event_name: null`, `is_dynamic: true`. Step 3's subagents will retry via Pattern A/B (same-file constant / enum) when they read the file.
+- The argument is a template literal (`` `name_${...}` ``), variable, or expression → set `event_name: null`, `is_dynamic: true`.
+
+**Don't try to be clever.** If the slice doesn't make the literal obvious, leave it dynamic — step 3 has the file open and will resolve what it can.
+
+Skip `$pageview` and `$pageleave` matches entirely — they're SDK-internal in most setups. Drop those rows; they don't go into the inventory.
+
+### d. Write the base inventory
+
+`Write` `.posthog-events-inventory.json` with the rows:
+
+```jsonc
+{
+  "rows": [ <base rows> ],
+  "wrapper_undetected": false
+}
+```
+
+This file is small (a few hundred bytes per row, so tens of KB for 100 rows), so the Write fits in one turn easily.
+
+## Notes on wrapper resolution
+
+This step intentionally does **not** chase wrapper functions (`trackEvent`, `analytics.track`, etc.). Cross-file wrapper resolution doesn't fit cleanly in row-range subagent fan-out, and the skill's guiding principle is to let the reader ask follow-ups.
+
+If `wrapper_undetected: true` (SDK in deps but no direct calls found), the report step's data-quality check surfaces it, and the suggested-follow-ups list points the reader at: *"find calls to `trackEvent`/`logEvent`/`analytics.track` and resolve their callers as additional capture sites."*
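The `event_name` resolution rule in 2-scan.md (step c) can be approximated as below for the positional-1 call shape only (`posthog.capture("event", ...)`, as in posthog-js). This is a sketch under stated assumptions, not the skill's implementation: `resolve_event_name` is a hypothetical helper, `slice_text` is assumed to be the code content of the slice with grep's path and line prefixes already removed, and SDKs that put the event name in a later position or under a key are not handled.

```python
import re

# First argument of .capture( when it is a quoted literal ('...', "...", or `...`).
FIRST_ARG = re.compile(r"""\.capture\(\s*(['"`])((?:(?!\1).)*)\1""", re.S)

# Events the step drops entirely.
SKIPPED = {"$pageview", "$pageleave"}


def resolve_event_name(slice_text: str):
    """Return {event_name, is_dynamic}, or None when the row should be dropped."""
    match = FIRST_ARG.search(slice_text)
    if match is None:
        # Variable, expression, or no literal in the first slot: leave it for step 3.
        return {"event_name": None, "is_dynamic": True}
    quote, body = match.group(1), match.group(2)
    if quote == "`" and "${" in body:
        # Template literal with interpolation counts as dynamic.
        return {"event_name": None, "is_dynamic": True}
    if body in SKIPPED:
        return None
    return {"event_name": body, "is_dynamic": False}
```

A quoted property value later in the slice (for example `{ revenue: "USD" }`) is ignored here because only the first argument slot is inspected, which is the conservative reading of "don't try to be clever".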
diff --git a/transformation-config/skills/events-audit/references/3-enrich-reference.md b/transformation-config/skills/events-audit/references/3-enrich-reference.md
@@ -0,0 +1,85 @@
+# Step 3 enrichment reference
+
+Lookup tables and rules subagents apply during step 3 enrichment. Read this file **once** at the start of your enrichment run.
+
+This file is supporting material for step 3; it has no `next_step` and is not part of the main step chain. The orchestrator does not read it.
+
+The per-SDK capture call signatures (where `event_name` and `properties` live in each SDK's call shape) are in `2-scan.md` under "Per-SDK call signatures". Read that section once at the start of your enrichment run alongside this file — you'll need it to extract `event_name` and `properties`.
+
+## Identification surfaces
+
+Set `call_kind` according to the call:
+
+- `posthog.identify(distinctId, $set, $set_once)` → `identify`
+- `posthog.setPersonProperties({ ... })` → `set`
+- `posthog.setPersonPropertiesForFlags` → `set_once`
+- `posthog.group(type, key, properties)` → `group`
+- `posthog.alias(alias, distinctId)` → `alias`
+- `posthog.reset()` → `reset` (no event name; the identity check uses presence to score cross-device hygiene)
+
+## `package` rules (monorepo dimension)
+
+Compute `package` **before** `area` from the file path. Match the first prefix below; everything after the prefix's package segment is what `area` rules then operate on.
+
+| Path prefix | `package` |
+|---|---|
+| `apps/<name>/...` | `<name>` |
+| `packages/<name>/...` | `<name>` |
+| `services/<name>/...` | `<name>` |
+| `projects/<name>/...` | `<name>` |
+| Anything else | `null` |
+
+Examples:
+- `apps/web/components/Checkout/Checkout.tsx` → `package: "web"`, then `area` rules see `components/Checkout/Checkout.tsx`.
+- `packages/sdk/src/track.ts` → `package: "sdk"`, then `area` rules see `src/track.ts`.
+- `src/checkout/Checkout.tsx` → `package: null`, `area` rules see the original path.
+
+Don't fabricate a package from `src/` or `app/` — those are within-package directories, not package roots.
+
+## `area` rules
+
+After `package` extraction, strip one leading `src/`, `app/`, or `pages/` from the remaining path. Then apply the first matching rule:
+
+| Path shape after stripping | `area` |
+|---|---|
+| `app/<x>/...` (Next.js app router) | `<x>` |
+| `pages/<x>/...` (Next.js pages router) | `<x>` (use `api/<seg>` for `pages/api/<seg>/...`) |
+| `components/<x>/...` | `<x>` |
+| `features/<x>/...` | `<x>` |
+| `screens/<x>/...` | `<x>` (mobile) |
+| `routes/<x>/...`, `views/<x>/...`, `controllers/<x>/...` (backend) | `<x>` |
+| `hooks/...`, `lib/...`, `utils/...`, `analytics/...`, `services/...`, `helpers/...` | `shared` |
+| `app/layout.tsx`, `app/template.tsx`, `_app.tsx`, `_document.tsx`, `app/error.tsx`, `app/not-found.tsx` | `global` |
+| Anything else | first path segment after stripping, lowercased |
+
+Strip only the first matching prefix.
+
+## `route` rules (Next.js only)
+
+- `app/foo/page.tsx` → `/foo`
+- `app/foo/bar/page.tsx` → `/foo/bar`
+- `app/foo/[id]/page.tsx` → `/foo/[id]`
+- `app/(group)/foo/page.tsx` → `/foo` (route groups in parens are ignored)
+- `pages/foo.tsx` → `/foo`
+- `pages/foo/[id].tsx` → `/foo/[id]`
+- `pages/api/<rest>` → `/api/<rest>` (without the file extension)
+
+Set `route: null` for any path that isn't router-shaped. Don't fabricate routes for non-Next.js codebases.
+
+## `enclosing` rules
+
+Backward-scan from the capture line and match these patterns (the closest match above the capture line wins):
+
+- `function (\w+)\(` (named function)
+- `const (\w+) = \(?` / `const (\w+) = async`
+- `export (?:default )?function (\w+)\(`
+- `export const (\w+) = `
+- `class (\w+)`
+- `def (\w+)\(` (Python)
+- `func (\w+)\(` (Go / Swift)
+- `fun (\w+)\(` (Kotlin)
+- `def (\w+)` (Ruby)
+
+Take the closest match above the capture line at column 0 or one indent level deeper than the capture's expected wrapper. If nothing matches within ~80 lines above, set `enclosing: null`. Don't read more file context to chase it.
+
+For unnamed default exports (`export default function () { ... }`), use the file's basename without extension as the enclosing name (e.g. `CheckoutPage.tsx` → `CheckoutPage`).
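The `package` and `route` tables in 3-enrich-reference.md translate fairly directly into code. The sketch below is one possible reading, with hypothetical helper names (`split_package`, `next_route`). It assumes `next_route` receives a path that already starts at `app/` or `pages/`, and it returns `/` for a root `app/page.tsx`, a case the reference does not spell out. Index pages under `pages/` are not special-cased.

```python
import re
from typing import Optional, Tuple

PACKAGE_ROOTS = ("apps", "packages", "services", "projects")


def split_package(path: str) -> Tuple[Optional[str], str]:
    """Return (package, remaining path) following the path-prefix table."""
    parts = path.split("/")
    if len(parts) >= 3 and parts[0] in PACKAGE_ROOTS:
        return parts[1], "/".join(parts[2:])
    return None, path  # src/ and app/ are not package roots


def next_route(path: str) -> Optional[str]:
    """Derive a Next.js route, or None when the path isn't router-shaped."""
    parts = path.split("/")
    stem = re.sub(r"\.[^./]+$", "", parts[-1])  # file name without extension
    if parts[0] == "app" and stem == "page":
        # Route groups such as (group) do not appear in the URL.
        segments = [p for p in parts[1:-1] if not (p.startswith("(") and p.endswith(")"))]
        return "/" + "/".join(segments)
    if parts[0] == "pages":
        return "/" + "/".join(parts[1:-1] + [stem])
    return None
```

Running `split_package` first and feeding its remainder to the later rules matches the instruction to compute `package` before `area`.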