Closed
Commits
50 commits
8ad6c7f
Add Zustand migration plan and AI rebuttal on rewrite vs restructure
ssrihari Mar 17, 2026
801b7fe
Extract state into Zustand stores and restructure workflow into domai…
ssrihari Mar 18, 2026
123cae6
Replace WorkflowRunner class with plain functions, reorganize by domain
ssrihari Mar 18, 2026
c9cc4a1
Refactor: execute 5 structural plans + rename provenance/Source types
ssrihari Mar 18, 2026
7e84159
Add VITE_AI_THINKING env var to control reasoning effort
ssrihari Mar 18, 2026
cfd7730
Remove setSelectedId threading from store — App.tsx owns navigation
ssrihari Mar 18, 2026
8110ccb
Fix UI issues from refactoring + rewrite.md improvements
ssrihari Mar 18, 2026
c3b2b43
Support configurable LLM providers (Ollama, Cerebras, Groq, etc.)
ssrihari Mar 18, 2026
18b8acd
Split finding-components log phase into identifying + classifying
ssrihari Mar 18, 2026
3b29c56
Add percentage precision toggle (0, 1, or 2 decimal places)
ssrihari Mar 18, 2026
d8c7e72
Include static components waffle in percentage precision toggle
ssrihari Mar 18, 2026
e0d9bb8
Fix ComparisonLegend crash: pass percentPrecision to inner component
ssrihari Mar 18, 2026
e3c2d25
Make pipeline dimension-scoped: only reprocess changed dimensions
ssrihari Mar 18, 2026
23dacbb
Move group fan-out to orchestrate.ts, clean up Group type
ssrihari Mar 18, 2026
cf4a4a4
Decouple UI components from App.tsx prop drilling
ssrihari Mar 18, 2026
157c647
Separate pipeline from orchestration, parallelize Classify+Color
ssrihari Mar 18, 2026
e13e07b
Decouple WorkflowDetailModal from callback props
ssrihari Mar 18, 2026
8a90a9c
Fix all pre-existing TypeScript errors
ssrihari Mar 18, 2026
ebe6ada
Fix group dimension editing: read/write from member files
ssrihari Mar 18, 2026
27c654c
Fix applyPromptsToAll: compare component lists, not just prompts
ssrihari Mar 18, 2026
5147142
Add segment sorting and switch color prompt to hex codes
ssrihari Mar 18, 2026
f4da415
Fix race condition: classify and color overwriting each other's dimen…
ssrihari Mar 19, 2026
be71cbf
Make dimension pipeline steps idempotent, simplify applyPromptsToAll
ssrihari Mar 19, 2026
c8ad277
Fix dimension accordion expanding for all conversations at once
ssrihari Mar 20, 2026
43d43ae
Update baseline-browser-mapping to latest
ssrihari Mar 20, 2026
06dc19f
Decouple store from orchestration: remove passthrough methods
ssrihari Mar 20, 2026
30951ad
Restructure src/ into layered architecture
ssrihari Mar 20, 2026
4ea8345
Improve naming consistency and fix misplaced files
ssrihari Mar 20, 2026
e1bfac7
Move App.tsx to ui/ — all React code now lives under ui/
ssrihari Mar 20, 2026
62928df
Add architecture.md and clean up documentation
ssrihari Mar 20, 2026
ab22f72
Lift per-dimension loop from stages into pipeline
ssrihari Mar 20, 2026
f013ad5
Remove no-op self-assignments in identifyForDimension
ssrihari Mar 20, 2026
d6437d9
Fix: pipeline skipping componentisation for new files
ssrihari Mar 20, 2026
e6fde2c
Declarative pipeline with pure stages and runner
ssrihari Mar 20, 2026
f2b4573
Imperative pipeline, delete orchestrate.ts
ssrihari Mar 20, 2026
1c0dd01
Remove startFrom — stages skip themselves via idempotency
ssrihari Mar 20, 2026
daae82e
Clean up pipeline: remove redundant wrappers and double-writes
ssrihari Mar 20, 2026
887c140
Add session recording instrumentation for test data capture
ssrihari Mar 24, 2026
d654f9b
Fix parent tracking: use time containment instead of global call stack
ssrihari Mar 26, 2026
6945561
Store diffs instead of full snapshots + instrument 9 missing functions
ssrihari Mar 26, 2026
a82b7ce
Fix parent computation for identical time intervals
ssrihari Mar 26, 2026
ca212ff
Fix parent computation for triple-identical intervals (A→B→C chains)
ssrihari Mar 26, 2026
c729de4
Add test-writing guide for session recordings
ssrihari Mar 26, 2026
4768248
Update test guide with reprocessing, applyPromptsToAll, grouping, and…
ssrihari Mar 26, 2026
fbf79d9
Add comprehensive test suite using session recordings as ground truth
ssrihari Mar 26, 2026
e5edb43
Add test run command to Quick Start
ssrihari Mar 26, 2026
24dc788
Add CI test workflow and require passing tests for PR merges
ssrihari Mar 26, 2026
d38c15f
Fix CI: copy test inputs to fixtures, remove sample-logs dependency
ssrihari Mar 26, 2026
66d2770
Add behavioural specification derived from test assertions
ssrihari Mar 26, 2026
f7b7db6
Add core extraction lessons from tracer bullet experiment
ssrihari Mar 27, 2026
14 changes: 14 additions & 0 deletions .env.example
@@ -8,3 +8,17 @@ VITE_AI_API_KEY=your-openai-api-key
# RECOMMENDED: gpt-4o-mini (fast and cheap)
# NOT RECOMMENDED: o1-preview, o1-mini, gpt-5 (reasoning models are 10-20x slower and not necessary)
VITE_AI_MODEL=gpt-4o-mini

# Optional: Base URL for OpenAI-compatible API providers
# For local Ollama: http://localhost:11434/v1
# For Groq: https://api.groq.com/openai/v1
# VITE_AI_BASE_URL=http://localhost:11434/v1

# API mode: "responses" (default, OpenAI Responses API) or "chat" (Chat Completions API)
# Use "chat" for non-OpenAI providers (Ollama, Cerebras, Groq, Together, Fireworks, etc.)
# VITE_AI_API_MODE=chat

# Control reasoning/thinking for OpenAI reasoning models (gpt-5 series, o-series)
# Values: none (off), low, medium, high
# Omit or leave empty for model default behavior
# VITE_AI_THINKING=none
16 changes: 16 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,16 @@
name: Tests

on:
pull_request:
branches: [main]
push:
branches: [main]

jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: oven-sh/setup-bun@v1
- run: bun install
- run: bun run test
25 changes: 22 additions & 3 deletions README.md
@@ -44,6 +44,9 @@ cp .env.example .env

# Start the development server
bun run dev

# Run tests
npx vitest run
```

### Environment Configuration
@@ -56,8 +59,23 @@ Create a `.env` file based on `.env.example`:
# AI API Configuration for Semantic Segmentation
VITE_AI_API_KEY=your-openai-api-key
VITE_AI_MODEL=gpt-4o-mini # Optional, defaults to gpt-4o-mini

# Optional: Use a different provider (Ollama, Cerebras, Groq, etc.)
VITE_AI_BASE_URL=http://localhost:11434/v1 # e.g. Ollama
VITE_AI_API_MODE=chat # "chat" for non-OpenAI providers, "responses" (default) for OpenAI
```

#### Alternative providers

Any OpenAI-compatible API works. Set `VITE_AI_API_MODE=chat` for non-OpenAI providers.

| Provider | Base URL | Example model | Notes |
|----------|----------|---------------|-------|
| **Ollama** (local) | `http://localhost:11434/v1` | `gemma3:1b` | Free; no real API key needed (set `VITE_AI_API_KEY` to any placeholder value) |
| **Cerebras** | `https://api.cerebras.ai/v1` | `llama3.1-8b` | Free tier: 24M tokens/day, very fast |
| **Groq** | `https://api.groq.com/openai/v1` | `llama-3.1-8b-instant` | ~$0.06/1M tokens, fast |
| **OpenAI** (default) | _(not needed)_ | `gpt-4o-mini` | Uses Responses API by default |

## Documentation

- [System overview](docs/system-overview.md) — data model, processing pipeline, visualizations, and interactive workflow
@@ -83,9 +101,10 @@ evolving, still. To begin with, it will support the completions and
responses API formats. They're implemented behind an interface so it's
easy to add another format's parser.

Currently this tool only supports open-ai as the LLM provider, but the
idea is to be fully model and format agnostic. It uses vercel's AI
SDK, so it should be easy enough to add support for other providers.
This tool supports any OpenAI-compatible LLM provider, including local
models via Ollama. It uses Vercel's AI SDK with a configurable base
URL and API mode, so you can use OpenAI, Cerebras, Groq, or run a
small model like Gemma 3 1B locally.

## License

17 changes: 11 additions & 6 deletions docs/CAPABILITIES.md
@@ -225,13 +225,18 @@ Then:

## File Locations

See [architecture.md](./architecture.md) for the full directory structure.

| What | Where |
|------|-------|
| Prompts | `src/prompts.ts` |
| Data model & types | `src/model/` |
| Prompts | `src/stages/ai/prompts.ts` |
| Parsers | `src/parsers/` |
| Components | `src/components/` |
| Schemas | `src/schema.ts`, `src/input-schemas.ts` |
| AI logic | `src/componentisation.ts`, `src/ai-summary.ts`, `src/segmentation.ts` |
| Pipeline stages | `src/stages/` |
| Orchestration | `src/pipeline/` |
| State management | `src/stores/` |
| UI components | `src/ui/components/` |
| Schemas | `src/model/schema.ts`, `src/parsers/input-schemas.ts` |

---

@@ -240,10 +245,10 @@ Then:
To add a new input format:

1. Create `src/parsers/your-format-parser.ts`
2. Implement `Parser` interface with `canParse()` and `parse()` methods
2. Implement `Parser` interface (from `src/model/types.ts`) with `canParse()` and `parse()` methods
3. Register in `src/parsers/index.ts`
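As a sketch, a new parser might look like this. The `Parser` shape (`canParse()` + `parse()`) follows the steps above; the `turns` input format, field names, and the simplified `Conversation` type are invented for illustration, not the project's actual types:

```typescript
// Hypothetical parser for an invented "turns" JSON format.
// Illustrative only — the real Parser and Conversation types live in src/model/types.ts.
interface Message { role: string; content: string }
interface Conversation { messages: Message[] }

interface Parser {
  canParse(raw: string): boolean;
  parse(raw: string): Conversation;
}

const yourFormatParser: Parser = {
  canParse(raw: string): boolean {
    // Cheap structural check: valid JSON with a "turns" array.
    try {
      return Array.isArray(JSON.parse(raw).turns);
    } catch {
      return false;
    }
  },
  parse(raw: string): Conversation {
    const data = JSON.parse(raw) as { turns: { speaker: string; text: string }[] };
    return {
      messages: data.turns.map((t) => ({ role: t.speaker, content: t.text })),
    };
  },
};
```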

To modify component identification:

1. Edit default prompt in `src/prompts.ts` (`getDefaultComponentIdentificationPrompt`)
1. Edit default prompt in `src/stages/ai/prompts.ts` (`getDefaultComponentIdentificationPrompt`)
2. Or use the UI prompt editor for per-session changes
228 changes: 228 additions & 0 deletions docs/architecture.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,228 @@
# Architecture

Context Viewer analyzes AI conversation logs — breaking them into semantic
components, counting tokens, and visualizing how context is distributed.
Everything runs in the browser; data stays local unless explicitly sent to
an AI API.

## Directory structure

```
src/
├── model/ Data definitions — the nouns
├── operations/ Pure transforms over the model — the verbs
├── parsers/ Pluggable input format adapters
├── stages/ Processing pipeline stages
│ └── ai/ AI infrastructure (config, prompts, logging)
├── pipeline/ Orchestration — sequences stages, manages lifecycle
├── stores/ Zustand state management
├── ui/ React components, hooks, and UI utilities
│ ├── App.tsx Application shell (root composition)
│ ├── components/ React components + shadcn primitives
│ ├── hooks/ React hooks
│ └── lib/ UI utilities (Tailwind helpers, color lookups, etc.)
├── lib/ Generic utilities (id-generator)
└── main.tsx Entry point
```

### Dependency rule

Each layer only imports from layers below it. No upward or circular
dependencies.

```
model/ → nothing (zod only)
operations/ → model/
parsers/ → model/
stages/ → model/ + operations/ + stages/ai/ + AI SDK
pipeline/ → model/ + stages/
stores/ → model/ + operations/ + pipeline/ + parsers/
ui/ → stores/ + model/ + operations/
```

## Layers

### model/

The data definitions everything else is built on. No logic, no side
effects, no I/O.

| File | Contents |
|------|----------|
| `schema.ts` | `Message`, `Conversation`, `Part` — Zod-validated types for the standard conversation format |
| `types.ts` | `PipelineState`, `Group`, `DimensionData`, `Stage`, `StageGroup`, `PipelineStep`, `ConversationMetadata`, `ConversationSummary`, `ComponentTimelineSnapshot`, and other core domain types |
| `dimensions.ts` | Dimension accessor helpers: `ensureDimensions`, `getDimension`, `getEffectiveComponents`, `getAllComponents` |
| `export-schema.ts` | Zod schemas for the JSON export format (`FileExport`, `SessionExport`) |
| `presets.ts` | `PresetConfig`, `PresetSummary` type definitions |

**Key types:**

- **`PipelineState`** — the central type. Represents a conversation file
being processed: its identity, lifecycle status, parsed data, dimensions,
static components, prompts, and timing info.
- **`Group`** — lightweight metadata referencing member files by ID. Groups
don't concatenate conversations; the UI reconstructs a virtual view.
- **`DimensionData`** — one categorization scheme with `discoveredComponents`
(AI-found), `customComponents` (user-provided), mapping, timeline, and
colors. Use `getEffectiveComponents()` to get the active list.
- **`Stage`** — granular execution units (`"parsing"`, `"identifying-components"`,
etc.). Maps 1:1 to files in `stages/`.
- **`StageGroup`** — coarser UI checkpoints. `"finding-components"` groups
identify + classify + color.
- **`PipelineStep`** — ordered enum for pipeline resumption points
(`Parse=0` through `Color=5`).

### operations/

Pure functions over model types. No AI calls, no I/O, no state. Given data
in, return data out.

| File | What it does |
|------|-------------|
| `aggregation.ts` | Token aggregation by component, timeline building, tuple computation for multi-dimension analysis, CSV generation |
| `conversation-summary.ts` | Computes message/role/part-type stats from a `Conversation` |
| `token-counting.ts` | Adds `token_count` to every message part using tiktoken (GPT-4 encoding) |
| `static-components.ts` | Deterministic componentization by `role.partType` (no AI needed) |
| `message-filters.ts` | Predicate-based filtering of messages/parts by role and type |
| `export-builder.ts` | Builds `FileExport` and `SessionExport` JSON structures (pure data — no download I/O) |
| `color-math.ts` | Hex/RGB conversion, lighten/darken/blend — pure math, no Tailwind |

### parsers/

Each parser converts one external conversation format into the standard
`Conversation` schema. Adding a new format = one new file + register it.

| File | Format |
|------|--------|
| `claude-transcripts-parser.ts` | Claude API transcripts |
| `codex-transcripts-parser.ts` | Codex CLI transcripts |
| `opencode-transcripts-parser.ts` | OpenCode agent transcripts |
| `completions-parser.ts` | OpenAI Completions API |
| `responses-parser.ts` | OpenAI Responses API |
| `conversations-parser.ts` | Generic conversation JSON |
| `trajectory-parser.ts` | Agent trajectory format |
| `swe-agent-trajectory-parser.ts` | SWE-Agent trajectories |
| `plain-text-parser.ts` | Raw text / markdown |
| `context-viewer-parser.ts` | Re-import pre-processed Context Viewer exports |

Supporting files: `parser.ts` (registry), `file-formats.ts` (format
detection), `file-import.ts` (drop input handling), `input-schemas.ts`
(Zod schemas for input formats).

### stages/

Each file is one processing stage — the algorithm and its pipeline
integration in one place. Stages that use AI depend on `stages/ai/`.

| Stage file | What it does | Uses AI? |
|-----------|-------------|----------|
| `parse.ts` | Parse file → `Conversation` + metadata. Also handles restoring pre-processed exports. | No |
| `count-tokens.ts` | Add token counts + run static componentization | No |
| `segment.ts` | Split large text parts into semantic chunks | Yes |
| `identify-components.ts` | Discover the component list for each dimension | Yes |
| `classify-components.ts` | Map every part → component, build timeline | Yes |
| `color-components.ts` | Assign hex colors to components (AI or preset) | Yes |
| `summarize.ts` | Generate streaming conversation summary | Yes |
| `analyze.ts` | Generate streaming context analysis from components + summary | Yes |

**stages/ai/** — infrastructure shared by AI-powered stages:

| File | What it does |
|------|-------------|
| `config.ts` | AI provider configuration, model creation (`getAIConfig`, `createModel`) |
| `prompts.ts` | 6 prompt templates with custom override support |
| `strip-large-content.ts` | Remove images/files, truncate tool outputs before AI calls |
| `preset-loader.ts` | Load preset JSON from server (note: this does HTTP I/O) |

### pipeline/

Orchestration: how stages get sequenced, how errors are handled, how
results get written back to the store.

| File | What it does |
|------|-------------|
| `pipeline.ts` | Step ordering and execution. Conversation-level: Parse → CountTokens → Segment. Dimension-level: Identify → (Classify + Color in parallel). Handles pre-processed imports, API key pauses, and resume. |
| `orchestrate.ts` | Higher-level operations: reprocess from a given step, apply prompts to all files, generate summary/analysis on demand, batch processing. Uses `StoreAccessor` for dependency injection. |
| `notify.ts` | Lifecycle callbacks (`startStep`, `endStep`, `markComplete`, `markFailed`) that push state updates to the store. |
| `logging.ts` | Per-conversation structured logging with pub/sub for UI display. |
| `stage-logger.ts` | Factory for loggers bound to a specific stage. |

### stores/

Zustand state management. The adapter between the non-UI pipeline world
and the React UI.

| File | What it does |
|------|-------------|
| `conversation-store.ts` | Core app state: `conversations` (PipelineState[]), `groups`, file CRUD, pipeline execution. Thin adapter around `pipeline/orchestrate`. |
| `ui-store.ts` | Transient UI state: dialog open/close, editing prompts, loaded preset, active dimensions. |
| `url-store.ts` | URL ↔ UI state sync: selected conversation, active tab, sidebar state, message filters. |
| `actions.ts` | Glue functions that wire UI intent → pipeline execution. Reads from stores, calls orchestrate functions, handles errors. |

### ui/

Everything React. Components, hooks, and UI-specific utilities.

- **`App.tsx`** — root composition: layout, dropzone, routing, store subscriptions
- **`components/`** — all React components (conversation list, message view, charts, dialogs, etc.) + shadcn primitives in `ui/`
- **`hooks/useUrlState.ts`** — sync component state to URL
- **`lib/`** — `utils.ts` (Tailwind merge), `component-colors.ts` (Tailwind class lookups), `static-component-colors.ts`, `part-type-config.ts` (labels/emoji), `url-state.ts` (URL serialization), `url-fetch.ts`, `export-download.ts` (browser download I/O)

## Processing pipeline

When a file is dropped, it flows through these stages:

```
File dropped
├─ Parse ──────────────── stages/parse.ts
│ └─ Detect format, parse → Conversation + metadata
├─ Count Tokens ───────── stages/count-tokens.ts
│ └─ tiktoken encoding + static componentization
├─ [no API key? pause here, resume later]
├─ Segment ────────────── stages/segment.ts (AI)
│ └─ Split large parts into semantic chunks
├─ Per dimension:
│ ├─ Identify ───────── stages/identify-components.ts (AI)
│ │ └─ Discover component list
│ │
│ ├─ Classify ───────── stages/classify-components.ts (AI) ─┐
│ │ └─ Map parts → components, build timeline │ parallel
│ │ │
│ └─ Color ──────────── stages/color-components.ts (AI) ─┘
│ └─ Assign hex colors
└─ Done (summary + analysis are on-demand, not in the main pipeline)
```

**Re-entry:** The pipeline can restart from any `PipelineStep`. Changing a
prompt restarts from Identify; changing segmentation restarts from Segment.
Each stage is idempotent — it skips if its inputs match its outputs.
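The skip-if-done behaviour can be sketched as follows. `StageDef` and the state shape here are invented for the example, not the actual `pipeline.ts` code:

```typescript
// Illustrative sketch of idempotent stage skipping.
// A stage declares when its work is already done; the runner just iterates.
interface State { done: Set<string>; log: string[] }

interface StageDef {
  name: string;
  isDone(state: State): boolean; // true when outputs already match inputs
  run(state: State): State;
}

function runPipeline(stages: StageDef[], state: State): State {
  for (const stage of stages) {
    if (stage.isDone(state)) continue; // idempotent: nothing to redo
    state = stage.run(state);
    state.done.add(stage.name);
  }
  return state;
}
```

Because each stage checks its own completion, re-entry needs no `startFrom` parameter: restarting the whole pipeline naturally fast-forwards past finished stages.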

**Pre-processed imports:** Context Viewer export files skip the entire
pipeline. The parser restores all dimensions, components, colors, and
summaries from the export metadata.

## Multi-dimensional analysis

A single conversation can be analyzed along multiple dimensions
simultaneously. Each dimension has its own:

- Identification prompt
- `discoveredComponents` list (AI-found) or `customComponents` (user-provided)
- Part-to-component mapping
- Timeline
- Colors and coloring prompt

Use `getEffectiveComponents(dim)` from `model/dimensions.ts` to get the
active component list (custom overrides discovered).
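The override rule can be sketched as follows, assuming a minimal `DimensionData` shape based on the description above (not the actual `model/dimensions.ts` source):

```typescript
// Assumed DimensionData shape for illustration only.
interface DimensionData {
  discoveredComponents: string[]; // AI-found
  customComponents?: string[];    // user-provided; overrides when present
}

function getEffectiveComponents(dim: DimensionData): string[] {
  // Custom components, when non-empty, replace the discovered list entirely.
  return dim.customComponents && dim.customComponents.length > 0
    ? dim.customComponents
    : dim.discoveredComponents;
}
```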

## Groups

A `Group` is a lightweight container referencing member files by ID. It
doesn't concatenate conversations — the UI reconstructs a virtual merged
view. Groups can have their own summary and analysis prompts.