Single source of truth for development progress, file ownership, and task tracking. Check items off as you complete them. If swapping roles mid-sprint, the incoming dev reads the checked/unchecked state to know exactly where things stand.
Every source file has exactly one owner. No file is touched by more than one developer during a given phase. Dependencies between developers are managed through interface contracts defined upfront.
| Current file | Planned location (after 3C reorg) | Owner |
|---|---|---|
| `src/docbot/cli.py` | `cli.py` (stays at top) | Dev A |
| `src/docbot/models.py` | `models.py` (stays at top) | Dev A |
| `src/docbot/llm.py` | `llm.py` (stays at top) | Dev A |
| `src/docbot/__init__.py` | `__init__.py` (stays at top) | Dev A |
| `pyproject.toml` | (root) | Dev A |
| `src/docbot/project.py` | `git/project.py` | Dev A |
| `src/docbot/scanner.py` | `pipeline/scanner.py` | Dev B |
| `src/docbot/orchestrator.py` | `pipeline/orchestrator.py` | Dev B |
| `src/docbot/git_utils.py` | `git/utils.py` | Dev B |
| `src/docbot/hooks.py` | `git/hooks.py` | Dev B |
| `src/docbot/extractors/*` | `extractors/*` (already a package) | Dev B |
| `src/docbot/explorer.py` | `pipeline/explorer.py` | Dev B |
| `src/docbot/search.py` | `web/search.py` | Dev B |
| `src/docbot/planner.py` | `pipeline/planner.py` | Dev C |
| `src/docbot/reducer.py` | `pipeline/reducer.py` | Dev C |
| `src/docbot/renderer.py` | `pipeline/renderer.py` | Dev C |
| `src/docbot/tracker.py` | `pipeline/tracker.py` | Dev C |
| `src/docbot/server.py` | `web/server.py` | Dev C |
| `src/docbot/viz_server.py` | `viz/viz_server.py` | Dev C |
| `src/docbot/_viz_html.py` | `viz/_viz_html.py` | Dev C |
| `src/docbot/mock_viz.py` | `viz/mock_viz.py` | Dev C |
| `webapp/*` | `webapp/*` | Dev D |
| `tests/*` | `tests/*` | Dev B |
New files (planned):
| Planned file | Owner | Phase |
|---|---|---|
| `src/docbot/git/history.py` | Dev B | 3D |
| `src/docbot/git/diff.py` | Dev B | 3E |
Agent exploration files (LangGraph refactor):
| File | Owner | Status |
|---|---|---|
| `src/docbot/exploration/__init__.py` | Dev B | Complete |
| `src/docbot/exploration/graph.py` | Dev B | Complete |
| `src/docbot/exploration/tools.py` | Dev B | Complete |
| `src/docbot/exploration/store.py` | Dev B | Complete |
| `src/docbot/exploration/prompts.py` | Dev B | Complete |
| `src/docbot/exploration/callbacks.py` | Dev B | Complete |
| `webapp/src/features/exploration/AgentExplorer.tsx` | Dev D | Complete |
| `webapp/src/features/exploration/AgentDetail.tsx` | Dev D | Complete |
| `webapp/src/features/exploration/NotepadViewer.tsx` | Dev D | Complete |
| `webapp/src/features/exploration/useAgentStream.ts` | Dev D | Complete |
| `webapp/src/features/exploration/types.ts` | Dev D | Complete |
| `docs/AGENT_ARCHITECTURE.md` | Dev B | Complete |
| `docs/MIGRATION_NOTES.md` | Dev B | Complete |
| `tests/test_exploration_graph.py` | Dev B | Complete |
All items complete. Tree-sitter + LLM fallback extraction implemented across Python, TypeScript, JavaScript, Go, Rust, Java, Kotlin, C#, Swift, Ruby. Scanner generalized, explorer refactored, planner/reducer/renderer prompts updated for dynamic language info, CLI/orchestrator wired.
Expand Phase 1 checklist (all checked)
- Add `SourceFile` model, `FileExtraction` model
- Update `ScanResult`, `ScopeResult`, `DocsIndex` with language fields
- Define `Extractor` protocol, review and merge
- Scanner generalization (LANGUAGE_EXTENSIONS, entrypoint/package detection, SKIP_DIRS)
- LLM client review, pyproject.toml deps, exports update
- Webapp server skeleton (FastAPI with /api/index, /api/scopes, /api/graph, /api/search, /api/files, /api/fs)
- Extractors package (base.py, python_extractor.py, treesitter_extractor.py, llm_extractor.py)
- Explorer refactor (remove AST code, use get_extractor())
- Semantic search (SearchIndex class)
- Planner updates (crosscutting patterns, dynamic language prompts)
- Reducer updates (generalized edge computation, dynamic language prompts)
- Renderer updates (dynamic language prompts/templates)
- React SPA scaffold (Vite + React + TypeScript + Tailwind)
- Interactive system graph (ReactFlow), chat panel, code viewer, documentation browser
All items complete. FastAPI backend serves analyzed data + AI chat. React frontend with interactive graph, chat panel, code viewer, guided tours, documentation browser. `docbot serve` launches the full experience.
Expand Phase 2 checklist (all checked)
- Orchestrator adapted to source_files, languages pass-through
- Server completion (source endpoint, search, chat, tours)
- CLI updates (help text, serve subcommand, --no-llm behavior)
- Additional tree-sitter grammars (Kotlin, C#, Swift, Ruby)
- Test suite (test_python_extractor, test_treesitter_extractor, test_llm_extractor, test_scanner, test_explorer)
- Serve static files from webapp/dist/
- Switch from mocks to real API endpoints
- End-to-end testing, polish loading states
- Legacy viz integration decision (marked as legacy)
Goal: Transform docbot from a standalone doc generator into a git-aware CLI tool with persistent `.docbot/` project directory, incremental updates based on git diffs, documentation history with snapshots, before/after comparison, git lifecycle hooks, and a change-aware webapp.
Design decisions: CWD default (optional path override), only config.toml git-tracked, init and generate are separate commands, git hooks opt-in via `docbot hook install`, last N snapshots for history (configurable, default 10), both explicit `docbot update` + optional hooks (post-commit and post-merge), change-aware chat via context injection into default `/api/chat` + dedicated `/api/changes` endpoint.
Owner: Dev A (CLI, models, project), Dev B (git_utils, hooks, scanner)
- Add `ProjectState` model:
  - `last_commit: str | None` -- git commit hash at last generate/update
  - `last_run_id: str | None` -- most recent run ID
  - `last_run_at: str | None` -- ISO timestamp of last run
  - `scope_file_map: dict[str, list[str]]` -- scope_id -> repo-relative file paths
- Add `DocbotConfig` model:
  - `model: str` (default from llm.py)
  - `concurrency: int = 4`
  - `timeout: float = 120.0`
  - `max_scopes: int = 20`
  - `no_llm: bool = False`
- Implement `init_project(path)`:
  - Validate path is a git repo (check `.git/` exists)
  - Create `.docbot/` directory with subdirs (`docs/`, `docs/modules/`, `scopes/`, `history/`)
  - Write default `config.toml`
  - Write `.gitignore` that ignores everything except `config.toml` and `.gitignore`
- Implement `find_docbot_root(start)`:
  - Walk start and parents looking for `.docbot/` directory
  - Return the parent of `.docbot/` (the project root), or None
- Implement `load_config(docbot_dir)` / `save_config(docbot_dir, config)`:
  - TOML reading via `tomllib` (stdlib 3.11+)
  - Simple string formatting for writing
- Implement `load_state(docbot_dir)` / `save_state(docbot_dir, state)`:
  - JSON via Pydantic `model_dump_json()` / `model_validate_json()`
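The project-directory steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual `project.py`: the default config text and the exact `.gitignore` patterns are assumptions, and the `model` default (which comes from llm.py) is left out.

```python
from __future__ import annotations

from pathlib import Path

# Assumed default config text; the real defaults live in DocbotConfig.
DEFAULT_CONFIG_TOML = (
    "concurrency = 4\n"
    "timeout = 120.0\n"
    "max_scopes = 20\n"
    "no_llm = false\n"
)

# Assumed patterns: ignore everything in .docbot/ except the two tracked files.
GITIGNORE = "*\n!config.toml\n!.gitignore\n"


def init_project(path: Path) -> Path:
    """Create the .docbot/ layout inside a git repo and return its path."""
    if not (path / ".git").exists():
        raise ValueError(f"{path} is not a git repository")
    docbot_dir = path / ".docbot"
    # docs/modules also creates docs/ because parents=True.
    for sub in ("docs/modules", "scopes", "history"):
        (docbot_dir / sub).mkdir(parents=True, exist_ok=True)
    (docbot_dir / "config.toml").write_text(DEFAULT_CONFIG_TOML, encoding="utf-8")
    (docbot_dir / ".gitignore").write_text(GITIGNORE, encoding="utf-8")
    return docbot_dir


def find_docbot_root(start: Path) -> Path | None:
    """Walk start and its parents; return the directory that contains .docbot/."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".docbot").is_dir():
            return candidate
    return None
```

Note that this sketch overwrites an existing `config.toml` on re-init; a real implementation would probably want to keep user edits.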
- `init [path]` command
- `generate [path]` command (calls `run_async` currently; will call `generate_async` after 3B)
- `update` command (stub -- falls back to full generate; will call `update_async` after 3B)
- `status` command (shows last commit, changed files, affected scopes)
- `config [key] [value]` command (view all / get one / set one)
- `hook install` / `hook uninstall` subcommands
- `serve [path]` adapted to default to `.docbot/` via `find_docbot_root()`
- `run` kept as hidden alias for `generate`
- `get_current_commit(repo_root)` -- `git rev-parse HEAD`
- `get_changed_files(repo_root, since_commit)` -- `git diff --name-only`
- `is_commit_reachable(repo_root, commit)` -- `git cat-file -t`
- `get_repo_root(start)` -- `git rev-parse --show-toplevel`
- `install_hook(repo_root)` -- post-commit hook with sentinel comments
- `uninstall_hook(repo_root)` -- remove docbot section, delete if empty
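One way the sentinel-comment approach could work is sketched below. The marker strings, the hook body (`docbot update || true`), and the `name` parameter are assumptions for illustration; only the install/uninstall behavior is taken from the checklist.

```python
from __future__ import annotations

import stat
from pathlib import Path

# Hypothetical sentinel strings; the checklist only says "sentinel comments".
BEGIN = "# >>> docbot hook >>>"
END = "# <<< docbot hook <<<"
SECTION = f"{BEGIN}\ndocbot update || true\n{END}\n"


def install_hook(repo_root: Path, name: str = "post-commit") -> Path:
    """Append the docbot section to a hook file, creating the file if needed."""
    hook = repo_root / ".git" / "hooks" / name
    hook.parent.mkdir(parents=True, exist_ok=True)
    text = hook.read_text(encoding="utf-8") if hook.exists() else "#!/bin/sh\n"
    if BEGIN not in text:  # idempotent: never add the section twice
        if not text.endswith("\n"):
            text += "\n"
        text += SECTION
    hook.write_text(text, encoding="utf-8")
    # Git only runs hooks that are executable.
    hook.chmod(hook.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook


def uninstall_hook(repo_root: Path, name: str = "post-commit") -> None:
    """Remove the docbot section; delete the hook if nothing else remains."""
    hook = repo_root / ".git" / "hooks" / name
    if not hook.exists():
        return
    kept: list[str] = []
    inside = False
    for line in hook.read_text(encoding="utf-8").splitlines():
        if line == BEGIN:
            inside = True
        elif line == END:
            inside = False
        elif not inside:
            kept.append(line)
    # A file holding only a shebang and blank lines counts as empty.
    remaining = [ln for ln in kept if ln.strip() and not ln.startswith("#!")]
    if remaining:
        hook.write_text("\n".join(kept) + "\n", encoding="utf-8")
    else:
        hook.unlink()
```

Wrapping the docbot lines in markers lets a user's own hook content survive both install and uninstall untouched.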
- Add `".docbot"` to `SKIP_DIRS`
Owner: Dev B (orchestrator, git integration), Dev C (renderer refactor)
Depends on: 3A (complete)
- Extract pipeline stage helpers from `run_async()`:
  - `_run_scan(repo_path, tracker)` -> ScanResult
  - `_run_plan(scan, max_scopes, llm_client, tracker)` -> list[ScopePlan]
  - `_run_explore(plans, repo_path, sem, timeout, llm_client, tracker)` -> list[ScopeResult]
  - `_run_reduce(scope_results, repo_path, llm_client, tracker)` -> DocsIndex
  - `_run_render(docs_index, output_dir, llm_client, tracker)` -> list[Path]
- Refactor `run_async()` to call extracted helpers (no behavior change)
- Implement `generate_async(docbot_root, config, llm_client, tracker)`:
  - Infer `repo_path = docbot_root.parent`
  - Run full 5-stage pipeline, output to `docbot_root`
  - Save plan.json, per-scope results, docs_index.json
  - Build scope_file_map, call `save_state()` with current commit
  - Save RunMeta to history/
- Implement `update_async(docbot_root, config, llm_client, tracker)`:
  - Load state, validate last_commit via `is_commit_reachable()`
  - Fall back to `generate_async()` if commit unreachable
  - Get changed files, map to affected scopes via scope_file_map
  - Handle new unscoped files (assign to nearest scope by directory)
  - If >50% scopes affected, print suggestion to run `generate` instead
  - Re-explore affected scopes, load cached results for unaffected
  - Merge and re-run REDUCE
  - Call selective renderer functions (Dev C)
  - Update state.json and save run history
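The scope-mapping step of `update_async()` might look something like the sketch below. The function names are hypothetical, and "nearest scope by directory" is interpreted here as the closest ancestor directory that already holds a scoped file, which is one plausible reading rather than the confirmed rule.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def affected_scopes(
    changed_files: list[str], scope_file_map: dict[str, list[str]]
) -> set[str]:
    """Map changed repo-relative paths to the scope IDs that must be re-explored."""
    owner = {f: scope for scope, files in scope_file_map.items() for f in files}
    # Directory -> scope, used to place files that no scope lists yet.
    dir_owner: dict[PurePosixPath, str] = {}
    for f, scope in owner.items():
        dir_owner.setdefault(PurePosixPath(f).parent, scope)

    affected: set[str] = set()
    for path in changed_files:
        if path in owner:
            affected.add(owner[path])
            continue
        # New unscoped file: walk up until a directory with a scoped file is found.
        for parent in PurePosixPath(path).parents:
            if parent in dir_owner:
                affected.add(dir_owner[parent])
                break
    return affected


def should_suggest_generate(
    affected: set[str], scope_file_map: dict[str, list[str]]
) -> bool:
    """True when more than half of all scopes are affected."""
    return bool(scope_file_map) and len(affected) / len(scope_file_map) > 0.5
```

For example, with scopes `core` (`src/core/a.py`) and `web` (`src/web/app.py`), a new file `src/web/routes/new.py` lands in `web` because `src/web` is its closest ancestor that already contains a scoped file.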
- Extract `render_scope_doc(scope, index, out_dir, llm_client)` -- single scope markdown
- Extract `render_readme(index, out_dir, llm_client)` -- README.generated.md
- Extract `render_architecture(index, out_dir, llm_client)` -- architecture.generated.md
- Extract `render_api_reference(index, out_dir)` -- api.generated.md (template-only)
- Extract `render_html_report(index, out_dir)` -- index.html
- Refactor `render()` and `render_with_llm()` to call individual functions (no behavior change)
- Update `generate` command to call `generate_async()` instead of `run_async()`
- Update `update` command to call `update_async()` instead of falling back to generate
- `run_async()` still works identically after refactor (backward compat)
- `generate_async()` produces same output as `run_async()` but writes to `.docbot/`
- `generate_async()` writes correct state.json with commit hash and scope_file_map
- `update_async()` only re-explores affected scopes
- `update_async()` falls back to generate when state is invalid
- Individual render functions work standalone
Owner: All devs (coordinated, touching only owned files)
Depends on: 3B (complete)
Move from 20 flat files in src/docbot/ to organized packages.
- Create `src/docbot/pipeline/` package:
  - Move `scanner.py`, `planner.py`, `explorer.py`, `reducer.py`, `renderer.py`, `orchestrator.py`, `tracker.py`
- Create `src/docbot/git/` package:
  - Move `git_utils.py` -> `git/utils.py`
  - Move `hooks.py` -> `git/hooks.py`
  - Move `project.py` -> `git/project.py`
- Create `src/docbot/web/` package:
  - Move `server.py` -> `web/server.py`
  - Move `search.py` -> `web/search.py`
- Create `src/docbot/viz/` package:
  - Move `viz_server.py`, `_viz_html.py`, `mock_viz.py`
- Keep at top level: `cli.py`, `models.py`, `llm.py`, `__init__.py`
- Update all internal imports across the codebase
- Update `cli.py` imports to use new package paths
- Update `pyproject.toml` entry points if needed
- Verify no import errors across the package
Owner: Dev A (models), Dev B (history management)
Depends on: 3B (complete -- needs generate_async/update_async to hook into)
- Add `DocSnapshot` model:
  - `commit_hash: str` -- git commit at snapshot time
  - `run_id: str` and `timestamp: str`
  - `scope_summaries: dict[str, ScopeSummary]` -- scope_id -> { file_count, symbol_count, summary_hash }
  - `graph_digest: str` -- hash of dependency graph edges
  - `doc_hashes: dict[str, str]` -- doc filename -> content hash
  - `stats: SnapshotStats` -- total files, scopes, symbols, edges
- Add `max_snapshots: int = 10` field to `DocbotConfig`
- `save_snapshot(docbot_dir, docs_index, scope_results, run_id, commit)` -- create DocSnapshot + save scope results
- `load_snapshot(docbot_dir, run_id)` -- load a specific snapshot
- `list_snapshots(docbot_dir)` -- list available snapshots with metadata
- `prune_snapshots(docbot_dir, max_count)` -- remove oldest beyond limit
- Snapshot storage: `.docbot/history/<run_id>.json` (metadata) + `.docbot/history/<run_id>/` (scope results)
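Given that storage layout, listing and pruning could be sketched as below. The sketch assumes every `*.json` file directly under `history/` is snapshot metadata carrying `run_id` and `timestamp` keys, and that timestamps are ISO strings in one consistent format so they sort lexicographically; neither assumption is confirmed by the checklist.

```python
from __future__ import annotations

import json
import shutil
from pathlib import Path


def list_snapshots(docbot_dir: Path) -> list[dict]:
    """Return snapshot metadata, newest first (sorted by the stored timestamp)."""
    history = docbot_dir / "history"
    snaps = [json.loads(p.read_text(encoding="utf-8")) for p in history.glob("*.json")]
    return sorted(snaps, key=lambda s: s["timestamp"], reverse=True)


def prune_snapshots(docbot_dir: Path, max_count: int) -> list[str]:
    """Delete snapshots beyond max_count, oldest ones first; return removed run IDs."""
    history = docbot_dir / "history"
    removed: list[str] = []
    for snap in list_snapshots(docbot_dir)[max_count:]:
        run_id = snap["run_id"]
        # Remove both the metadata file and the per-run scope results directory.
        (history / f"{run_id}.json").unlink(missing_ok=True)
        shutil.rmtree(history / run_id, ignore_errors=True)
        removed.append(run_id)
    return removed
```

With the default `max_snapshots = 10`, calling `prune_snapshots(docbot_dir, 10)` after each save keeps the ten newest runs and their scope results.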
- Hook `save_snapshot()` into `generate_async()` after state save
- Hook `save_snapshot()` into `update_async()` after state save
- Call `prune_snapshots()` after each save
Owner: Dev A (CLI), Dev B (diff logic, models)
Depends on: 3D (complete -- needs snapshots to compare)
- Add `ScopeModification` model:
  - `scope_id: str`
  - `added_files: list[str]`, `removed_files: list[str]`
  - `added_symbols: list[str]`, `removed_symbols: list[str]`
  - `summary_changed: bool`
- Add `DiffReport` model:
  - `added_scopes: list[str]` -- scope IDs that are new
  - `removed_scopes: list[str]` -- scope IDs that no longer exist
  - `modified_scopes: list[ScopeModification]`
  - `graph_changes: GraphDelta` -- new edges, removed edges, changed nodes
  - `stats_delta: StatsDelta` -- change in total files, scopes, symbols
- `compute_diff(snapshot_from, snapshot_to)` -> DiffReport
- Compare scope lists (added/removed/modified)
- Per modified scope: compare file lists, symbol lists, doc hashes
- Compare graph edges (added/removed)
- Compute stats deltas
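The scope-comparison part of `compute_diff()` reduces to set arithmetic, sketched below. `ScopeView` is a hypothetical flattened view of one scope (its files, symbols, and summary hash), and dataclasses stand in for the Pydantic models; graph and stats deltas are omitted.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ScopeView:  # hypothetical, flattened view of one scope in a snapshot
    files: set[str] = field(default_factory=set)
    symbols: set[str] = field(default_factory=set)
    summary_hash: str = ""


@dataclass
class ScopeModification:
    scope_id: str
    added_files: list[str]
    removed_files: list[str]
    added_symbols: list[str]
    removed_symbols: list[str]
    summary_changed: bool


def diff_scopes(
    old: dict[str, ScopeView], new: dict[str, ScopeView]
) -> tuple[list[str], list[str], list[ScopeModification]]:
    """Return (added_scopes, removed_scopes, modified_scopes) between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified: list[ScopeModification] = []
    for scope_id in sorted(old.keys() & new.keys()):
        a, b = old[scope_id], new[scope_id]
        mod = ScopeModification(
            scope_id=scope_id,
            added_files=sorted(b.files - a.files),
            removed_files=sorted(a.files - b.files),
            added_symbols=sorted(b.symbols - a.symbols),
            removed_symbols=sorted(a.symbols - b.symbols),
            summary_changed=a.summary_hash != b.summary_hash,
        )
        # Only report scopes where something actually differs.
        if (mod.added_files or mod.removed_files or mod.added_symbols
                or mod.removed_symbols or mod.summary_changed):
            modified.append(mod)
    return added, removed, modified
```

Sorting every output list keeps the report deterministic, so two runs over the same pair of snapshots produce identical diffs.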
- Add `docbot diff [--from <commit-or-run>] [--to <commit-or-run>]` command
- Defaults: --from = previous snapshot, --to = current state
- Output: human-readable summary of what changed
Owner: Dev B (hooks expansion), Dev A (CLI flags)
Depends on: 3B (complete -- needs working update_async)
- Add `install_post_merge_hook(repo_root)` -- same pattern as post-commit
- Update `install_hook()` to install both post-commit and post-merge by default
- Add `--commit-only` flag to install only post-commit
- Update `uninstall_hook()` to remove from both hook files
- Update `docbot hook install` to accept `--commit-only` flag
- Update help text to describe post-merge behavior
- `docbot hook install` creates both post-commit and post-merge hooks
- `docbot hook install --commit-only` creates only post-commit
- `docbot hook uninstall` removes all docbot hooks
- `git pull` with post-merge hook triggers `docbot update`
(Verified via manual code review and hook installation test)
Owner: Dev C (server endpoints), Dev D (frontend UI)
Depends on: 3D (snapshots), 3E (diff)
- `GET /api/changes` -- returns DiffReport between current and previous snapshot
- `GET /api/changes?from=<run_id>&to=<run_id>` -- compare specific snapshots
- `GET /api/history` -- list available snapshots with metadata
- `GET /api/history/<run_id>` -- specific snapshot detail
- Update `POST /api/chat` system prompt to inject recent DiffReport when available
- Changes banner -- summary banner when changes exist since last view
- Architecture graph diff view -- overlay showing added (green), removed (red), modified (yellow) nodes/edges
- Scope diff panel -- side-by-side or inline diff of scope documentation
- Timeline view -- visual timeline of snapshots, click to compare any two
- Chat change context -- suggested questions update when changes detected
- `/api/changes` returns correct DiffReport
- `/api/history` lists all snapshots
- Changes banner appears in webapp after an update
- Graph highlights changed nodes/edges
- Chat can answer "what changed?" questions with accurate references
Owner: Dev C (tracker, viz_server, viz HTML), Dev B (orchestrator save), Dev A (CLI command)
Depends on: 3B (needs generate_async/update_async to save events), 3D (events stored alongside snapshots)
- Add `_events: list[dict]` and `_start_time: float` to `PipelineTracker`
- Record "add" event on every `add_node()` call
- Record "state" event on every `set_state()` call
- Implement `export_events()` -> `{"run_id": ..., "total_duration": ..., "events": [...]}`
- Add no-op `export_events()` to `NoOpTracker`
- Add `set_run_id()` method to both tracker classes
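A stripped-down sketch of the event recording described above follows. Only the recording parts are shown; the `add_node()` / `set_state()` signatures, the `_record` helper, and the per-event field names (`type`, `t`, `id`, ...) are assumptions rather than the real tracker API.

```python
from __future__ import annotations

import time


class PipelineTracker:
    """Minimal stand-in: only the event-recording parts are sketched."""

    def __init__(self) -> None:
        self._events: list[dict] = []
        self._start_time: float = time.monotonic()
        self._run_id: str | None = None

    def set_run_id(self, run_id: str) -> None:
        self._run_id = run_id

    def _record(self, kind: str, **payload) -> None:
        # "t" is seconds since tracker creation (assumed field name).
        elapsed = time.monotonic() - self._start_time
        self._events.append({"type": kind, "t": elapsed, **payload})

    def add_node(self, node_id: str, name: str, parent_id: str | None = None) -> None:
        self._record("add", id=node_id, name=name, parent=parent_id)

    def set_state(self, node_id: str, state: str) -> None:
        self._record("state", id=node_id, state=state)

    def export_events(self) -> dict:
        return {
            "run_id": self._run_id,
            "total_duration": time.monotonic() - self._start_time,
            "events": list(self._events),
        }


class NoOpTracker:
    """Same surface as PipelineTracker, but records nothing."""

    def set_run_id(self, run_id: str) -> None: ...
    def add_node(self, *args, **kwargs) -> None: ...
    def set_state(self, *args, **kwargs) -> None: ...

    def export_events(self) -> dict:
        return {"run_id": None, "total_duration": 0.0, "events": []}
```

Storing timestamps relative to `_start_time` means a replay can start its clock at zero without knowing when the original run happened.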
- Call `tracker.export_events()` at end of `generate_async()` and `update_async()`
- Write to `.docbot/history/<run_id>/pipeline_events.json`
- Implement `start_replay_server(events_path)`:
  - `GET /` serves replay HTML
  - `GET /events` serves recorded event log as JSON
  - Auto-open browser
  - Blocking server with Ctrl+C shutdown
- Create `REPLAY_HTML` constant (~430 lines)
- JavaScript event player: virtual clock, applies events up to current time
- Play / Pause control
- Speed selector (1x, 2x, 4x, 8x)
- Timeline scrubber (click to seek)
- Step forward / back (one event at a time)
- Elapsed time display (current position / total duration)
- Same D3 radial tree rendering as live mode
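The actual player is JavaScript inside `REPLAY_HTML`; the core virtual-clock idea is sketched here in Python to match the rest of the codebase. The class name and the `t` timestamp field are assumptions, and single-event stepping is left out.

```python
from __future__ import annotations

from bisect import bisect_right


class ReplayClock:
    """Virtual clock over a recorded event list sorted by timestamp "t"."""

    def __init__(self, events: list[dict], speed: float = 1.0) -> None:
        self.events = events
        self.times = [e["t"] for e in events]
        self.speed = speed  # e.g. 1, 2, 4 or 8
        self.position = 0.0  # virtual seconds since the start of the run

    def advance(self, real_seconds: float) -> None:
        """Move the virtual clock forward by wall-clock time scaled by speed."""
        self.position += real_seconds * self.speed

    def seek(self, position: float) -> None:
        """Jump to an absolute position, as the timeline scrubber would."""
        self.position = max(0.0, position)

    def applied(self) -> list[dict]:
        """Events whose timestamp is at or before the current position."""
        return self.events[: bisect_right(self.times, self.position)]
```

Because `applied()` is derived purely from the current position, seeking backwards needs no undo logic: the view is rebuilt from the shorter event prefix.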
- Add `docbot replay [run_id]` command
- Default to most recent run if no run_id given
- Start replay server + open browser
- Port configuration via `--port` flag
- Live pipeline run saves `pipeline_events.json` to history
- `docbot replay` opens replay of most recent run
- `docbot replay <run_id>` replays a specific past run
- Playback controls (play/pause/speed/scrub/step) work correctly
- Replay visualization matches what the live view showed during the original run
- `NoOpTracker.export_events()` returns empty data without errors
- All 99 unit tests passing
After all Phase 3 sections complete:
- `docbot init` creates valid `.docbot/` with config.toml and .gitignore
- `docbot generate` runs full pipeline into `.docbot/`, saves state + snapshot
- `git status` only shows `.docbot/config.toml` as trackable
- `docbot status` shows correct state after generate
- Make a code change, commit
- `docbot update` only re-processes affected scopes, saves new snapshot
- `docbot diff` shows what changed between snapshots
- `docbot serve` loads webapp from `.docbot/` with changes banner
- Chat answers "what changed?" questions
- `docbot hook install` creates post-commit + post-merge hooks
- Committing auto-triggers `docbot update` via post-commit hook
- `git pull` auto-triggers `docbot update` via post-merge hook
- `docbot hook uninstall` removes all hooks cleanly
- `docbot replay` opens replay of most recent pipeline run
- `docbot replay <run_id>` replays a specific past run with full playback controls
- `docbot run` works as alias for generate
- `docbot config` read/write works
- Test on a Python project (regression)
- Test on a TypeScript project
- Test on a mixed-language project
If a developer needs to take over another's work mid-sprint:
- Read their checklist above -- checked items are done, unchecked items remain
- Check out their branch -- all their work-in-progress is there
- Only touch their owned files -- the file ownership table above is the source of truth
- Update this checklist as you complete items