Hmbown
diff --git a/README.md b/README.md
Lines changed: 66 additions & 27 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 1 addition & 1 deletion
diff --git a/src/nemocode/__init__.py b/src/nemocode/__init__.py
Lines changed: 1 addition & 1 deletion
diff --git a/src/nemocode/cli/commands/agent.py b/src/nemocode/cli/commands/agent.py
Lines changed: 1 addition & 1 deletion
diff --git a/src/nemocode/cli/commands/init_cmd.py b/src/nemocode/cli/commands/init_cmd.py
Lines changed: 1 addition & 3 deletions
diff --git a/src/nemocode/cli/commands/repl.py b/src/nemocode/cli/commands/repl.py
Lines changed: 35 additions & 1 deletion
diff --git a/src/nemocode/cli/render.py b/src/nemocode/cli/render.py
Lines changed: 1 addition & 3 deletions
diff --git a/src/nemocode/cli/tui.py b/src/nemocode/cli/tui.py
Lines changed: 27 additions & 1 deletion
diff --git a/src/nemocode/config/agents.py b/src/nemocode/config/agents.py
Lines changed: 17 additions & 8 deletions
diff --git a/src/nemocode/core/first_run.py b/src/nemocode/core/first_run.py
Lines changed: 2 additions & 2 deletions
@@ -6,54 +6,55 @@ Agentic coding CLI for [NVIDIA NIM](https://build.nvidia.com). Reads your code,
 
 ## Install
 
-From source (editable):
+From PyPI:
 ```bash
-pip install -e .
+pip install nemocode
 ```
 
-Or from PyPI:
+Or from source (editable):
 ```bash
-pip install nemocode
+pip install -e .
 ```
 
-## Setup
+## Quick Start
 
 Run the guided setup wizard:
 
 ```bash
 nemo setup
 ```
 
-It defaults to hosted NVIDIA NIM, prompts for `NVIDIA_API_KEY`, and can also configure a local `vllm` or `sglang` endpoint and model for you.
+The wizard defaults to **hosted NVIDIA NIM**, prompts for `NVIDIA_API_KEY`, and can also configure a local `vllm` or `sglang` backend for you.
+
+### Hosted NVIDIA NIM (default)
 
-If you just want hosted NIM manually, get a free API key from [build.nvidia.com](https://build.nvidia.com):
+Get a free API key from [build.nvidia.com](https://build.nvidia.com):
 
 ```bash
 export NVIDIA_API_KEY="nvapi-..."
 nemo code
 ```
 
-Hosted Nemotron/NIM endpoints in NeMoCode use `NVIDIA_API_KEY` by default.
+Hosted Nemotron endpoints use `NVIDIA_API_KEY` by default. The setup wizard can store it in your system keyring.
 
-Or serve a model locally with [vLLM](https://docs.vllm.ai/) or [SGLang](https://sgl-project.github.io/) on any NVIDIA GPU:
+### Local vLLM or SGLang
+
+Serve a model locally on any NVIDIA GPU:
 
 ```bash
 # vLLM
 vllm serve nvidia/NVIDIA-Nemotron-Nano-9B-v2 \
-  --trust-remote-code --mamba_ssm_cache_dtype float32 \
-  --enable-auto-tool-choice \
-  --tool-parser-plugin nemotron_toolcall_parser.py \
-  --tool-call-parser nemotron_json
+  --host 0.0.0.0 --port 8000
 nemo code -e local-vllm-nano9b
 
 # SGLang (best for Nemotron 3 Super long context on DGX Spark)
 python -m sglang.launch_server \
   --model nvidia/nemotron-3-super-120b-a12b \
-  --quantization nvfp4 --trust-remote-code
-nemo code -e spark-sglang-super
+  --host 0.0.0.0 --port 8000
+nemo code -e local-sglang-super
 ```
 
-No GPU? Rent one via [Brev](https://console.brev.dev) — L40S from $1.03/hr:
+No GPU? Rent one via [Brev](https://console.brev.dev):
 
 ```bash
 nemo setup brev
@@ -67,8 +68,32 @@ nemo code "fix the bug in auth.py" -y  # one-shot, auto-approve tools
 nemo chat "explain this error"         # chat, no tools
 cat log.txt | nemo code "diagnose"     # pipe input
 nemo code -f super-nano "refactor"     # multi-model formation
+nemo code --tui                        # full-screen TUI
+```
+
+## Plan Mode
+
+Plan mode is a read-only planning phase with an approval gate before execution.
+
+- **Read-only**: Plan mode only reads files, searches code, and explores — no writes, shell commands, or commits.
+- **Approval gate**: The planner proposes a concrete plan. You review and approve, revise with feedback, or cancel.
+- **Execution**: Once approved, a build agent executes the plan with full tool access.
+
+Switch modes in the REPL with Tab or `/mode`:
+
+| Mode | Behavior |
+|------|----------|
+| `code` | Ask before tool calls (default) |
+| `plan` | Read-only planning + approval gate |
+| `auto` | Auto-approve everything |
+
+Launch directly in plan mode:
+```bash
+nemo code --agent plan "implement user auth"
 ```
 
+The plan agent can also spawn read-only research subagents to help with exploration.
+
 ## Endpoints
 
 Works with any OpenAI-compatible API. Pre-configured:
@@ -78,7 +103,7 @@ Works with any OpenAI-compatible API. Pre-configured:
 | `nim-super` | Nemotron 3 Super (12B/120B MoE, 1M ctx) | NIM API key |
 | `nim-nano` | Nemotron 3 Nano (3B/30B MoE, 1M ctx) | NIM API key |
 | `nim-nano-9b` | Nemotron Nano 9B v2 | NIM API key |
-| `nim-nano-4b` | Nemotron Nano 4B v1.1 (new!) | NIM API key |
+| `nim-nano-4b` | Nemotron Nano 4B v1.1 | NIM API key |
 | `nim-vlm` | Nemotron Nano 12B VL (vision) | NIM API key |
 | `nim-embed` | Nemotron Embed 1B v2 | NIM API key |
 | `nim-rerank` | Nemotron Rerank 1B v2 | NIM API key |
@@ -105,18 +130,32 @@ nemo code -f super-nano "implement caching"
 | `vision` | VLM reads screenshots, Super writes code |
 | `local` | Nano on local GPU, no internet needed |
 
-## Agents
+## Agents & Sub-agent Orchestration
 
-NeMoCode now supports named agent profiles for top-level sessions and delegated sub-agents.
+NeMoCode supports named agent profiles for top-level sessions and delegated sub-agents.
 
-- Built-in primary agents: `build`, `plan`
-- Built-in sub-agents: `general`, `explore`, `review`, `debug`, `test`, `doc`, `code-search`, `fast`
+- **Primary agents**: `build` (default full-access), `plan` (read-only planning)
+- **Sub-agents**: `general`, `explore`, `review`, `debug`, `test`, `doc`, `code-search`, `fast`
 - Inspect them with `nemo agent ls` and `nemo agent show <name>`
 - Switch primary agents with `nemo code --agent <name>` or `/agent <name>` in the REPL/TUI
-- Sub-agent orchestration tools are now available in coding sessions: `delegate`, `spawn_agent`, `wait_agent`, `close_agent`, and `resume_agent`
-- Define custom agents in `.nemocode.yaml` under `agents:` or in markdown files under `.nemocode/agents/*.md`
 
-Example markdown agent:
+### Sub-agent tools
+
+In coding sessions, these orchestration tools are available:
+
+| Tool | Purpose |
+|------|---------|
+| `delegate` | Spawn a sub-agent and wait for the result |
+| `spawn_agent` | Spawn a background sub-agent for parallel work |
+| `wait_agent` | Wait for a spawned sub-agent to finish |
+| `close_agent` | Close or cancel a sub-agent handle |
+| `resume_agent` | Reopen a previously closed sub-agent handle |
+
+Sub-agents inherit read-only mode when delegated from plan mode.
+
+### Custom agents
+
+Define custom agents in `.nemocode.yaml` under `agents:` or as markdown files under `.nemocode/agents/*.md`:
 
 ```markdown
 ---
@@ -134,7 +173,7 @@ tools:
 Review the requested changes. Focus on correctness, regressions, and missing tests.
 ```
 
-## Local GPU setup
+## Setup Commands
 
 ```bash
 nemo setup          # guided wizard
@@ -146,7 +185,7 @@ nemo setup nim      # NIM container guide
 nemo setup brev     # rent a cloud GPU
 ```
 
-## More commands
+## More Commands
 
 ```bash
 nemo endpoint ls / test     # manage endpoints
@@ -157,7 +196,7 @@ nemo hardware recommend     # GPU-based recommendations
 nemo doctor                 # run diagnostics to check setup
 nemo session ls             # past conversations
 nemo obs pricing            # token pricing
-nemo init                   # create .nemocode.yaml without overriding your user default endpoint
+nemo init                   # create .nemocode.yaml without overriding user defaults
 ```
 
 ## Contributing
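
Review note on the README's sub-agent tools table: the five tools read like a handle-based task pool. The sketch below is one way to picture those semantics with `asyncio`; it is not NeMoCode's implementation. `SubAgentPool`, `Worker`, and the integer handles are hypothetical names chosen for illustration, and `resume_agent` is left out because the diff does not show how a closed handle is reopened.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical worker type: takes a prompt, eventually returns the sub-agent's output.
Worker = Callable[[str], Awaitable[str]]


class SubAgentPool:
    """Illustrative handle-based pool; names are assumptions, not NeMoCode's API."""

    def __init__(self, worker: Worker) -> None:
        self._worker = worker
        self._tasks: dict[int, asyncio.Task[str]] = {}
        self._next_id = 0

    def spawn_agent(self, prompt: str) -> int:
        """Start a background sub-agent and return a handle (needs a running loop)."""
        handle = self._next_id
        self._next_id += 1
        self._tasks[handle] = asyncio.create_task(self._worker(prompt))
        return handle

    async def wait_agent(self, handle: int) -> str:
        """Wait for a spawned sub-agent to finish and return its output."""
        return await self._tasks[handle]

    def close_agent(self, handle: int) -> None:
        """Drop the handle; cancelling a finished task is a harmless no-op."""
        self._tasks.pop(handle).cancel()

    async def delegate(self, prompt: str) -> str:
        """Spawn a sub-agent and wait for the result in one call."""
        return await self.wait_agent(self.spawn_agent(prompt))
```

Under this reading, `delegate` is `spawn_agent` followed immediately by `wait_agent`, and parallel work comes from spawning several handles before waiting on any of them.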
 
@@ -1,6 +1,6 @@
 [project]
 name = "nemocode"
-version = "0.1.18"
+version = "0.1.19"
 description = "Terminal-first control plane for NVIDIA Nemotron 3 — agentic coding, RAG, doc-ops, and multi-model formations."
 readme = "README.md"
 requires-python = ">=3.11"
 
@@ -3,4 +3,4 @@
 
 """NeMoCode — Terminal-first agentic coding CLI for NVIDIA Nemotron 3."""
 
-__version__ = "0.1.18"
+__version__ = "0.1.19"
@@ -9,8 +9,8 @@
 from rich.console import Console
 from rich.table import Table
 
-from nemocode.config.agents import resolve_agent_reference
 from nemocode.config import load_config
+from nemocode.config.agents import resolve_agent_reference
 from nemocode.config.schema import AgentMode
 from nemocode.core.subagents import list_runs
 from nemocode.tools.delegate import _pick_endpoint
 
@@ -34,9 +34,7 @@ def init_cmd(
     name: str = typer.Option(None, "--name", help="Project name"),
     force: bool = typer.Option(False, "--force", help="Overwrite existing config"),
     endpoint: str = typer.Option(None, "--endpoint", help="Project-specific default endpoint"),
-    formation: str = typer.Option(
-        None, "--formation", help="Project-specific active formation"
-    ),
+    formation: str = typer.Option(None, "--formation", help="Project-specific active formation"),
 ) -> None:
     """Create a .nemocode.yaml project config in the current directory."""
     config_path = Path.cwd() / ".nemocode.yaml"
 
@@ -717,7 +717,9 @@ def _cmd_agent(self, arg: str) -> bool:
         agent = primary_agents.get(resolved)
         if agent is None:
             available = ", ".join(sorted(primary_agents.keys()))
-            console.print(f"[red]Unknown primary agent: {arg}[/red]\n[dim]Available: {available}[/dim]")
+            console.print(
+                f"[red]Unknown primary agent: {arg}[/red]\n[dim]Available: {available}[/dim]"
+            )
             return True
 
         self._state.agent_name = resolved
@@ -1665,6 +1667,14 @@ async def _ask_user_interactive(question: str, options: list[str]) -> str:
                 renderer.finalize()
                 _render_status_bar(state)
 
+            # Check if plan approval is pending after the turn
+            if state.agent.has_pending_plan:
+                console.print(
+                    "\n[yellow]Plan awaiting your approval."
+                    " Reply with [bold]approve[/bold], [bold]revise[/bold],"
+                    " or [bold]cancel[/bold].[/yellow]"
+                )
+
             # Auto-save session after each turn
             _auto_save_session(state)
     finally:
@@ -1686,7 +1696,31 @@ async def _run_turn(state: _ReplState, user_input: str, renderer: _TurnRenderer)
 
     Handles Ctrl+C gracefully: sets the cancelled flag and drains remaining events
     rather than raising, so the session stays in a consistent state.
+    Also handles resume of pending plan approvals.
     """
+    # Check if we're resuming a pending plan approval
+    if state.agent.has_pending_plan:
+        result = await state.agent.try_handle_plan_response(user_input)
+        if result is not None:
+            # User input was a recognized plan decision — handle it
+            renderer.start_thinking("Processing plan decision")
+            try:
+                async for event in result:
+                    if state.cancelled:
+                        continue
+                    try:
+                        renderer.render_event(event)
+                    except KeyboardInterrupt:
+                        state.cancel()
+                        console.print("\n[dim]Cancelling...[/dim]")
+                        continue
+            finally:
+                pass
+            return
+        # Not a plan decision — clear pending state and proceed as normal input
+        state.agent._pending_plan_text = None
+        state.agent._pending_plan_user_input = None
+
     # Start a unified thinking spinner via the renderer.
     # It persists through read-only tool execution, updating with progress,
     # and stops automatically when the text response begins.
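
Review note on the `_run_turn` change above: the control flow it adds can be exercised in isolation. The sketch below is a minimal stand-in, assuming only what the hunk shows (`has_pending_plan`, an awaitable `try_handle_plan_response` that returns either an async event stream or `None`, and a `_pending_plan_text` attribute). `SketchAgent`, `propose`, `run_turn`, and the string events are hypothetical, and the real agent may keep the plan pending after `revise`.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator


class SketchAgent:
    """Hypothetical stand-in exposing only the plan-approval surface."""

    def __init__(self) -> None:
        self._pending_plan_text: str | None = None

    @property
    def has_pending_plan(self) -> bool:
        return self._pending_plan_text is not None

    def propose(self, plan: str) -> None:
        self._pending_plan_text = plan

    async def try_handle_plan_response(self, text: str) -> AsyncIterator[str] | None:
        """Return an event stream for a recognized decision, else None."""
        decision = text.strip().lower()
        if decision not in {"approve", "revise", "cancel"}:
            return None
        # Assumption: every recognized decision clears the pending plan.
        plan, self._pending_plan_text = self._pending_plan_text, None
        return self._events(decision, plan or "")

    async def _events(self, decision: str, plan: str) -> AsyncIterator[str]:
        yield f"{decision}: {plan}"


async def run_turn(agent: SketchAgent, user_input: str) -> list[str]:
    """Mirror the branch added to _run_turn, collecting events instead of rendering."""
    if agent.has_pending_plan:
        stream = await agent.try_handle_plan_response(user_input)
        if stream is not None:
            return [event async for event in stream]
        # Not a plan decision: drop the pending plan and treat it as normal input.
        agent._pending_plan_text = None
    return [f"normal turn: {user_input}"]


if __name__ == "__main__":
    agent = SketchAgent()
    agent.propose("add login route")
    print(asyncio.run(run_turn(agent, "approve")))
```

One consequence worth noting: any reply that is not a recognized decision silently discards the pending plan, which matches the hunk but may surprise users who ask a follow-up question before approving.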
 
@@ -833,9 +833,7 @@ def summarize_delegate_result(parsed: dict) -> tuple[str, str | None] | None:
         preview = str(parsed.get("output") or "").strip()
     else:
         preview = str(
-            parsed.get("last_output_preview")
-            or parsed.get("output_preview")
-            or ""
+            parsed.get("last_output_preview") or parsed.get("output_preview") or ""
         ).strip()
     if preview:
         preview = preview.splitlines()[0].strip()
 
@@ -42,8 +42,8 @@
     summarize_delegate_result,
     tool_result_has_embedded_error,
 )
-from nemocode.config.agents import resolve_agent_reference
 from nemocode.config import load_config
+from nemocode.config.agents import resolve_agent_reference
 from nemocode.config.schema import AgentMode, NeMoCodeConfig
 from nemocode.core.context import ContextManager
 from nemocode.core.metrics import MetricsCollector, RequestMetrics
@@ -1188,6 +1188,28 @@ async def _run_agent_turn(self, user_input: str) -> None:
             self.post_message(TurnComplete(error="No agent configured"))
             return
 
+        # Check if we're resuming a pending plan approval
+        if agent.has_pending_plan:
+            result = await agent.try_handle_plan_response(user_input)
+            if result is not None:
+                try:
+                    async for event in result:
+                        if self._state.cancelled:
+                            continue
+                        self.post_message(AgentEventMessage(event))
+                except asyncio.CancelledError:
+                    self.post_message(TurnComplete(error=None))
+                    return
+                except Exception as exc:
+                    logger.exception("Plan approval handling failed")
+                    self.post_message(TurnComplete(error=str(exc)))
+                    return
+                self.post_message(TurnComplete())
+                return
+            # Not a plan decision — clear pending and proceed normally
+            agent._pending_plan_text = None
+            agent._pending_plan_user_input = None
+
         try:
             async for event in agent.run(user_input):
                 if self._state.cancelled:
@@ -1299,6 +1321,10 @@ def _on_turn_complete(self, msg: TurnComplete) -> None:
         except NoMatches:
             pass
 
+        # Check if plan approval is pending
+        if self._state.agent and self._state.agent.has_pending_plan:
+            chat.add_system("Plan awaiting your approval. Reply with approve, revise, or cancel.")
+
     def _show_turn_summary(self, chat: ChatScroll) -> None:
         """Show a compact performance summary after a turn completes."""
         s = self._state
 
@@ -5,8 +5,8 @@
 
 from __future__ import annotations
 
-from copy import deepcopy
 import os
+from copy import deepcopy
 from pathlib import Path
 from typing import TYPE_CHECKING, Any
 
@@ -47,17 +47,26 @@
         "mode": "primary",
         "role": "planner",
         "tools": [
-            "fs_read", "git_read", "rg", "glob", "clarify",
-            "delegate", "spawn_agent", "wait_agent", "close_agent", "resume_agent"
+            "fs_read",
+            "git_read",
+            "rg",
+            "glob",
+            "clarify",
+            "delegate",
+            "spawn_agent",
+            "wait_agent",
+            "close_agent",
+            "resume_agent",
         ],
         "prompt": (
             "You are NeMoCode in plan mode. Read the codebase, analyze the task, "
             "and propose concrete next steps without modifying files or running "
-            "destructive commands. Use ask_user / ask_clarify only when you are blocked on missing requirements. "
-            "You can also spawn read-only research subagents to help with exploration. "
-            "When you have a plan, present it clearly for approval. The controller will handle approval, revision, "
-            "or cancellation after you respond. If the plan is revised, incorporate the user's feedback and return "
-            "only the updated plan."
+            "destructive commands. Use ask_clarify only when blocked on requirements. "
+            "You can also spawn read-only research subagents for exploration. "
+            "When you have a plan, present it clearly for approval. "
+            "The controller handles approval, revision, or cancellation after you "
+            "respond. If revised, incorporate the feedback and return only the "
+            "updated plan."
         ),
     },
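
Review note on the plan agent entry above: combined with the README mode table (`code`, `plan`, `auto`), the tool list reads like an allowlist-based approval policy. The sketch below illustrates that reading and is not NeMoCode's implementation. `Mode`, `decide`, and `PLAN_TOOLS` are hypothetical names; the tool names in `PLAN_TOOLS` are copied from the entry above, and `fs_write` in the usage note is an invented example of a non-allowlisted tool.

```python
from enum import Enum


class Mode(str, Enum):
    CODE = "code"  # ask before tool calls (default)
    PLAN = "plan"  # read-only planning + approval gate
    AUTO = "auto"  # auto-approve everything


# Copied from the plan agent's "tools" list in this diff.
PLAN_TOOLS = frozenset(
    {
        "fs_read", "git_read", "rg", "glob", "clarify",
        "delegate", "spawn_agent", "wait_agent", "close_agent", "resume_agent",
    }
)


def decide(mode: Mode, tool: str) -> str:
    """Return 'allow', 'ask', or 'deny' for a tool call in the given mode."""
    if mode is Mode.AUTO:
        return "allow"
    if mode is Mode.PLAN:
        # Assumption: anything outside the plan allowlist is refused outright.
        return "allow" if tool in PLAN_TOOLS else "deny"
    return "ask"
```

For example, `decide(Mode.PLAN, "fs_read")` yields `"allow"`, while `decide(Mode.PLAN, "fs_write")` yields `"deny"`.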
     "general": {
 
@@ -10,13 +10,13 @@
 from __future__ import annotations
 
 import os
-from pathlib import Path
 import sys
+from pathlib import Path
 
+import typer
 from rich.console import Console
 from rich.panel import Panel
 from rich.table import Table
-import typer
 
 from nemocode import __version__
 from nemocode.config import ensure_config_dir