
feat(knowledge): add branch retrieval plane with native QMD support#368

Open
vsumner wants to merge 4 commits into spacedriveapp:main from vsumner:feat/f1-qmd-retrieval-slice

Conversation


@vsumner vsumner commented Mar 8, 2026

Summary

This PR adds the first native knowledge retrieval plane for branches.

Branches can now call knowledge_recall, route queries through a normalized retrieval service, and pull external knowledge from QMD without exposing raw MCP tooling to the channel surface.

It also lays the worker/task-preset groundwork needed to route retrieval work cleanly in later slices.

What changed

  • add a new knowledge module with typed query/hit models, fusion, source registry, and KnowledgeRetrievalService
  • add the branch-only knowledge_recall tool plus prompt and docs wiring
  • add a native QMD adapter and Google Workspace source normalization under the same retrieval contract
  • wire channel and branch dispatch paths so retrieval-backed flows are available from branch work
  • add worker task preset groundwork and related prompt updates for later retrieval/task routing
  • update docs to clarify that branches, not the cortex bulletin, fan out to external knowledge sources

Review fixes included

  • gate QMD on source enablement instead of calling through when the source is disabled
  • keep scanning QMD results until enough valid hits are collected, so malformed leading rows do not suppress later good hits
  • resolve configured QMD MCP server names case-insensitively on the call_tool path
  • leak-scan outbound MCP tool arguments before any network call
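
A hedged sketch of the "keep scanning until enough valid hits" fix; the row format and parser below are stand-ins for illustration, not the actual QMD types:

```rust
/// Stand-in parser: a "valid" row has a non-empty id before a tab separator.
fn parse_row(row: &str) -> Option<String> {
    let (id, _rest) = row.split_once('\t')?;
    if id.is_empty() { None } else { Some(id.to_string()) }
}

/// Collect up to `limit` valid hits, skipping malformed rows instead of
/// stopping at the first one, so bad leading rows cannot suppress later hits.
fn collect_valid_hits(rows: &[&str], limit: usize) -> Vec<String> {
    rows.iter().filter_map(|row| parse_row(row)).take(limit).collect()
}
```

The key point is `filter_map` plus `take`: parsing failures are dropped rather than terminating the scan early.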

Verification

  • cargo fmt --all
  • cargo test qmd -- --nocapture
  • cargo test retrieve_mixed_sources -- --nocapture
  • cargo test call_tool_resolves_server_name_case_insensitively -- --nocapture
  • cargo test call_tool_blocks_secret_bearing_arguments_before_outbound_call -- --nocapture
  • just preflight
  • just gate-pr

Residual risk

  • Live MCP/QMD integration coverage is still missing, so transport and schema drift are covered by focused tests rather than an end-to-end probe.

coderabbitai bot commented Mar 8, 2026

Walkthrough

This PR introduces a comprehensive knowledge retrieval system with a new knowledge_recall tool that aggregates results from native memory, QMD, and Google Workspace sources. It adds worker task presets to shape retrieval workflows, implements result mediation for structured evidence payloads, makes embedding models optional, and refactors integration tests to be hermetic.

Changes

Cohort / File(s) Summary
Documentation
docs/content/docs/(core)/cortex.mdx, docs/content/docs/(core)/memory.mdx, docs/content/docs/(features)/tools.mdx, docs/design-docs/mcp.md, prompts/en/...
Updated docs to introduce knowledge_recall as distinct from memory_recall; clarifies delegation to branches and branch-visible retrieval options; documents task preset field for retrieval workflows and available tool secrets in worker prompts.
Knowledge Retrieval Module
src/knowledge.rs, src/knowledge/types.rs, src/knowledge/service.rs, src/knowledge/registry.rs, src/knowledge/qmd.rs, src/knowledge/fusion.rs, src/knowledge/google_workspace.rs
New public knowledge module with domain types (KnowledgeQuery, KnowledgeHit, KnowledgeProvenance, KnowledgeSourceDescriptor), KnowledgeRetrievalService orchestrating multi-source lookups, QMD adapter for MCP-based retrieval, Google Workspace normalization utilities, and reciprocal rank fusion for result aggregation.
Worker Task Preset System
src/agent/channel_dispatch.rs, src/agent/channel.rs, src/conversation/history.rs
Introduces WorkerTaskPreset enum; adds per-worker preset persistence in ChannelState and database; implements retrieval result mediation with structured payloads (RetrievalWorkerStatus, RetrievalWorkerSource) and formatters; threads preset through spawn/resume paths.
Tool Integration
src/tools.rs, src/tools/knowledge_recall.rs, src/tools/spawn_worker.rs, src/tools/mcp.rs, src/tools/memory_save.rs
Adds knowledge_recall tool to branch server with multi-source querying; updates spawn_worker with task_preset argument; expands McpToolOutput with server/tool identifiers and structured_content; adds secret scanning to MCP calls; guards memory save against missing embeddings.
MCP & Config
src/mcp.rs, src/api/mcp.rs, src/config.rs, src/config/types.rs, src/config/load.rs
Adds McpManager::call_tool method with preflight secret detection; canonicalizes MCP server names (especially "qmd") across read/create/update/delete paths and config loading; adds QMD_MCP_SERVER_NAME constant.
Memory Search
src/memory/search.rs
Makes embedding_model optional (Option<Arc>); adds new_without_embeddings constructor; guards vector search and provides graceful fallback when model unavailable.
Agent Lifecycle
src/agent/cortex.rs, src/agent/ingestion.rs
Updates pickup_one_ready_task and log_worker_started signatures to accept WorkerTaskPreset; uses embedding Option in warmup path; adds None argument to create_branch_tool_server in ingestion.
Library & Prompts
src/lib.rs, src/prompts/text.rs
Exposes new knowledge module at crate root; registers knowledge_recall_description prompt key.
Test Infrastructure & Migrations
tests/bulletin.rs, tests/context_dump.rs, migrations/20260308000001_worker_task_preset.sql, src/memory/store.rs
Refactors bulletin and context_dump tests to use hermetic harness with mock HTTP server, in-memory database, and isolated dependencies; adds worker_runs.task_preset column with default 'default'; switches test store to temp file-based SQLite for isolation.
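
The reciprocal rank fusion mentioned in the knowledge module summary can be sketched as follows. This is an illustrative shape, not the crate's `fusion.rs`; it uses structured `(source_id, hit_id)` keys, which also sidesteps the separator-collision issue flagged in the review:

```rust
use std::collections::HashMap;

/// Minimal reciprocal rank fusion sketch. Keys are (source_id, hit_id)
/// tuples so ids containing separator characters can never collide.
fn rrf(ranked_lists: &[Vec<(String, String)>], k: f64) -> Vec<((String, String), f64)> {
    let mut scores: HashMap<(String, String), f64> = HashMap::new();
    for list in ranked_lists {
        for (rank, key) in list.iter().enumerate() {
            // Standard RRF contribution: 1 / (k + rank), with rank starting at 1.
            *scores.entry(key.clone()).or_insert(0.0) += 1.0 / (k + (rank as f64 + 1.0));
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
```

A hit ranked highly by multiple sources accumulates contributions from each list and floats to the top of the fused ordering.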

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 70.79%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check — ✅ Passed. The PR title accurately describes the main change: adding a branch retrieval plane with native QMD support.
  • Description check — ✅ Passed. The PR description is well structured and directly related to the changeset, detailing what was added, what changed, review fixes, and verification steps.


@vsumner vsumner changed the title feat/f1 qmd retrieval slice feat(knowledge): add branch retrieval plane with native QMD support Mar 8, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/memory/search.rs (1)

79-112: ⚠️ Potential issue | 🟠 Major

Avoid panicking accessors on an optional embedding model.

new_without_embeddings() makes MemorySearch valid with embedding_model: None, but both public accessors still expect("embedding model unavailable"). Any caller that uses the new constructor and reaches these methods will crash the process instead of handling the missing capability.

♻️ Possible shape
-    pub fn embedding_model(&self) -> &EmbeddingModel {
-        self.embedding_model
-            .as_deref()
-            .expect("embedding model unavailable")
-    }
+    pub fn embedding_model(&self) -> Option<&EmbeddingModel> {
+        self.embedding_model.as_deref()
+    }

-    pub fn embedding_model_arc(&self) -> &Arc<EmbeddingModel> {
-        self.embedding_model
-            .as_ref()
-            .expect("embedding model unavailable")
-    }
+    pub fn embedding_model_arc(&self) -> Option<&Arc<EmbeddingModel>> {
+        self.embedding_model.as_ref()
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memory/search.rs` around lines 79 - 112, The accessors embedding_model()
and embedding_model_arc() currently unwrap embedding_model and will panic when
MemorySearch is constructed via new_without_embeddings(); change these methods
to return Option<&EmbeddingModel> and Option<&Arc<EmbeddingModel>> (or Result if
you prefer explicit errors) instead of calling expect, update their signatures
and bodies to return self.embedding_model.as_deref() and
self.embedding_model.as_ref(), and then fix all call sites to handle the None
case (e.g., propagate an error or skip embedding-dependent logic) so no
unexpected process panic occurs.
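A caller-side sketch of the Option-returning accessor the comment asks for; the types here are minimal stand-ins, not the crate's actual `MemorySearch`/`EmbeddingModel`:

```rust
// Minimal stand-ins; the real types live in src/memory/search.rs.
struct EmbeddingModel;
struct MemorySearch {
    embedding_model: Option<EmbeddingModel>,
}

impl MemorySearch {
    fn embedding_model(&self) -> Option<&EmbeddingModel> {
        self.embedding_model.as_ref()
    }

    // Callers branch on the Option instead of panicking: vector search only
    // runs when a model is present; keyword search still works without one.
    fn search_mode(&self) -> &'static str {
        match self.embedding_model() {
            Some(_model) => "vector+fts",
            None => "fts-only",
        }
    }
}
```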
src/agent/ingestion.rs (1)

477-488: ⚠️ Potential issue | 🟠 Major

Don't widen ingestion to external MCP connectors unless chunk egress is intentional.

process_chunk() is handing raw file content to this agent. Passing deps.mcp_manager.clone() into the branch tool server means uploaded/private chunks can now be sent to external MCP-backed tools during ingestion if the model decides to use them, which is a significant privacy/cost expansion for a path described as memory recall + save only. Prefer a narrower tool set here, or gate MCP tools behind explicit config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/ingestion.rs` around lines 477 - 488, The current call to
create_branch_tool_server inside process_chunk() passes
deps.mcp_manager.clone(), which allows MCP-backed tools to access raw ingested
chunks; remove or gate that parameter so ingestion does not widen to external
MCP connectors. Update the call site that constructs the ToolServerHandle
(create_branch_tool_server) from process_chunk() to pass a restricted tool set
or None for the MCP manager argument (or conditionally pass deps.mcp_manager
only when an explicit config flag is enabled), and ensure any downstream uses of
the returned ToolServerHandle do not implicitly enable MCP egress during
ingestion; reference create_branch_tool_server, ToolServerHandle, process_chunk,
and deps.mcp_manager when making the change.
docs/content/docs/(features)/tools.mdx (1)

71-84: ⚠️ Potential issue | 🟡 Minor

Keep the branch tool inventory aligned with create_branch_tool_server().

These sections still under-report the live branch surface. worker_inspect is registered on branches today, and the per-process list also omits memory_delete. If these lists are meant to be illustrative, label them as non-exhaustive; otherwise readers get the wrong tool model from this page.

Also applies to: 170-187

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/content/docs/`(features)/tools.mdx around lines 71 - 84, The branch tool
inventory in the docs is out of sync with the actual registered tools; update
the listing to match create_branch_tool_server() by adding missing tools like
worker_inspect and memory_delete (or mark the list explicitly as
non-exhaustive). Locate the table/block showing the branch tools (the example
list that includes memory_recall, knowledge_recall, etc.) and either append the
missing entries (worker_inspect, memory_delete) or add a parenthetical note such
as “(non-exhaustive — see create_branch_tool_server())” so readers aren’t misled
about the live branch surface.
🧹 Nitpick comments (3)
migrations/20260307000001_worker_task_preset.sql (1)

1-1: Consider constraining task_preset in the schema.

This column backs an enum-like runtime value, but the migration currently accepts any string. A simple CHECK (task_preset IN ('default', 'retrieval')) keeps mixed-version or manual writes from persisting values the app can't interpret later.

🧱 Suggested constraint
-ALTER TABLE worker_runs ADD COLUMN task_preset TEXT NOT NULL DEFAULT 'default';
+ALTER TABLE worker_runs
+ADD COLUMN task_preset TEXT NOT NULL DEFAULT 'default'
+CHECK (task_preset IN ('default', 'retrieval'));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@migrations/20260307000001_worker_task_preset.sql` at line 1, The migration
adds worker_runs.task_preset as freeform text; change it to enforce allowed
values by adding a CHECK constraint (e.g., on table worker_runs for column
task_preset) restricting values to 'default' and 'retrieval' so only those
enum-like strings can be persisted; update the ALTER TABLE/ADD COLUMN statement
(or add a subsequent ALTER TABLE ... ADD CONSTRAINT named e.g.
worker_runs_task_preset_check) to include the CHECK (task_preset IN
('default','retrieval')) constraint and keep the same NOT NULL DEFAULT
'default'.
src/memory/search.rs (1)

221-225: The degraded-mode log message is misleading.

FTS matches collected above are still fused into the output after a vector-search failure, so "falling back to graph only" points debugging at the wrong path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memory/search.rs` around lines 221 - 225, The log emitted in the
Err(error) arm of the vector search (the tracing::debug! call in
src/memory/search.rs) is misleading because full-text-search (FTS) matches are
still fused into results; change the message to accurately state that vector
search failed but FTS/graph matches will still be used (e.g., "vector search
failed, using graph+FTS results" or similar) and include the %error in the
message for context so debugging points to the correct fallback path.
src/knowledge/service.rs (1)

242-247: snippet should not be the full memory body.

This field goes straight into the tool payload. Cloning the entire memory content here makes large memories blow up recall responses and drown out fused results from other sources. Return a bounded excerpt instead.

✂️ Proposed fix
             let title = crate::summarize_first_non_empty_line(&result.memory.content, 80);
+            let snippet = crate::summarize_first_non_empty_line(&result.memory.content, 280);
             hits.push(KnowledgeHit {
                 id: result.memory.id.clone(),
                 title,
-                snippet: result.memory.content.clone(),
+                snippet,
                 content_type: format!("memory/{}", result.memory.memory_type),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/knowledge/service.rs` around lines 242 - 247, The KnowledgeHit being
built is copying the full memory body into snippet (in the construction of
KnowledgeHit for result.memory), which can bloat tool payloads; instead produce
a bounded excerpt: use an existing helper (e.g.,
crate::summarize_first_non_empty_line or a new helper like excerpt/truncate) to
create a truncated snippet from result.memory.content (e.g., cap to ~200–400
chars and append "…" when truncated) and assign that to snippet rather than
cloning the full content so large memories no longer overwhelm recall responses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agent/channel.rs`:
- Around line 373-381: The parsing branch treats any non-plain-JSON as
malformed; before calling serde_json::from_str on trimmed, normalize fenced JSON
by detecting and removing triple-backtick fences (both ```json and ``` variants)
and trimming surrounding whitespace into a new variable (e.g., normalized), then
call serde_json::from_str(&normalized) to produce the RetrievalWorkerPayload;
update the error branch to continue unchanged if parsing still fails.

In `@src/knowledge/fusion.rs`:
- Around line 19-27: The current string key built as format!("{}::{}",
hit.provenance.source_id, hit.id) can collide if either component contains "::";
change the map to use a structured key (e.g., a tuple (String, String) or a
small Key struct deriving Hash+Eq) instead of a concatenated string: update
fused_hits declaration from HashMap<String, FusedHit> to HashMap<(String,
String), FusedHit> (or HashMap<Key, FusedHit>), build the key as
(hit.provenance.source_id.clone(), hit.id.clone()) when inserting/looking up,
and adjust the .entry(...) calls and any downstream references to use the new
key type so distinct hits never merge incorrectly.

In `@src/knowledge/google_workspace.rs`:
- Around line 241-249: The status helper currently appends the authenticated
principal (status.principal) into the auth_prefix string, exposing PII; change
the logic in the auth_prefix construction (the match on auth_status / branch for
Some(status) if status.authenticated) to omit any use of status.principal and
simply return a non-identifying message like "credentials present" (remove the
principal formatting and unwrap_or_else usage that inserts the principal).
Ensure no other code in this block references status.principal for the
user-facing message.

In `@src/knowledge/registry.rs`:
- Around line 99-118: qmd_enablement currently detects QMD only by the MCP alias
string "qmd", which ignores valid endpoints using other aliases; update the
detection in qmd_enablement (and any places that feed default_source_ids) to use
explicit adapter/transport metadata on McpServerConfig (e.g., a type/adapter
field or capabilities flag) instead of server.name, or alternatively add a
config validation rule that reserves the "qmd" name and rejects/normalizes other
aliases; modify qmd_enablement to check
McpServerConfig.adapter/type/capabilities for a QMD transport and preserve the
existing enabled/disabled message semantics.

In `@src/knowledge/service.rs`:
- Around line 233-240: The loop currently awaits
store.record_access(&result.memory.id) for each recalled memory, adding per-item
I/O latency; change it to fire-and-forget: for each result clone or otherwise
move the needed handle/state (e.g., store clone or an Arc wrapper and the memory
id) into a tokio::spawn task and call store.record_access inside that task
without awaiting the join; ensure errors are handled/logged inside the spawned
task (use the same tracing::warn message there) so the retrieval hot path (the
loop over curated and the function that contains it) remains read-only and
returns immediately.

In `@src/tools/mcp.rs`:
- Around line 84-90: The handler currently converts MCP-declared failures into
an Err by checking result.is_error, which loses provenance; instead add an
is_error: bool field to the McpToolOutput struct and, where the code inspects
result.is_error (including the other occurrence around the block referenced at
127-136), populate that field and return Ok(McpToolOutput { server, tool,
result: ..., structured_content: ..., is_error: result.is_error }) so the
failure is returned as structured data rather than via the tool error channel.
- Around line 123-125: The code is duplicating structured_content by serializing
it into output_text via collect_result_text(&result) while also returning
result.structured_content raw, letting large JSON bypass MAX_TOOL_OUTPUT_BYTES;
update the handling so you either drop the extra raw structured_content or
enforce the same size/pruning on it. Specifically, adjust the logic around
collect_result_text, truncate_output, and the structured_content field: either
remove returning result.structured_content entirely (use the truncated
output_text only) or apply the same truncation/pruning (using
MAX_TOOL_OUTPUT_BYTES) to result.structured_content before attaching it to the
response so large MCP responses cannot exceed the limit.

In `@src/tools/spawn_worker.rs`:
- Around line 131-136: The task_preset schema still advertises "retrieval" as an
option even though retrieval is incompatible with worker_type="opencode"; update
the tool definition so callers are warned or prevented from selecting retrieval
for opencode workers — e.g., modify the "task_preset" description (and the
duplicate at the other occurrence) to state that "retrieval" must not be used
when worker_type is "opencode" or add a conditional validation rule that
rejects/omits the "retrieval" enum value when worker_type == "opencode"; refer
to the "task_preset" field and the worker_type value "opencode" when
implementing the change.

In `@tests/context_dump.rs`:
- Around line 365-376: The test is creating branch tool servers with
create_branch_tool_server(None, ...) which yields detached branches missing
spawn_worker registration; update tests to either (a) construct the branch tool
server from a real ChannelState (e.g., obtain/create a ChannelState and pass its
channel id/context into create_branch_tool_server) so the snapshots include
channel-spawned branch defs, or (b) explicitly label these snapshots as
“detached branch” cases; locate the calls to create_branch_tool_server and
adjust the test setup to use a ChannelState fixture or change the expected
snapshot labels accordingly so they reflect the presence/absence of spawn_worker
registration.
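
The fenced-JSON normalization suggested for src/agent/channel.rs could take a shape like this; a standalone sketch, not the actual implementation:

```rust
/// Strip a leading ```json / ``` fence and a trailing ``` so fenced model
/// output can still be handed to serde_json::from_str. Returns the input
/// trimmed if no opening fence is present.
fn strip_json_fences(raw: &str) -> &str {
    let trimmed = raw.trim();
    let Some(rest) = trimmed.strip_prefix("```") else {
        return trimmed;
    };
    // Drop an optional language tag like "json" on the opening fence line.
    let rest = rest.strip_prefix("json").unwrap_or(rest);
    let rest = rest.trim_start();
    rest.strip_suffix("```").map(str::trim_end).unwrap_or(rest).trim()
}
```

The error branch stays unchanged: if parsing the normalized string still fails, the payload is treated as malformed as before.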

---

Outside diff comments:
In `@docs/content/docs/`(features)/tools.mdx:
- Around line 71-84: The branch tool inventory in the docs is out of sync with
the actual registered tools; update the listing to match
create_branch_tool_server() by adding missing tools like worker_inspect and
memory_delete (or mark the list explicitly as non-exhaustive). Locate the
table/block showing the branch tools (the example list that includes
memory_recall, knowledge_recall, etc.) and either append the missing entries
(worker_inspect, memory_delete) or add a parenthetical note such as
“(non-exhaustive — see create_branch_tool_server())” so readers aren’t misled
about the live branch surface.

In `@src/agent/ingestion.rs`:
- Around line 477-488: The current call to create_branch_tool_server inside
process_chunk() passes deps.mcp_manager.clone(), which allows MCP-backed tools
to access raw ingested chunks; remove or gate that parameter so ingestion does
not widen to external MCP connectors. Update the call site that constructs the
ToolServerHandle (create_branch_tool_server) from process_chunk() to pass a
restricted tool set or None for the MCP manager argument (or conditionally pass
deps.mcp_manager only when an explicit config flag is enabled), and ensure any
downstream uses of the returned ToolServerHandle do not implicitly enable MCP
egress during ingestion; reference create_branch_tool_server, ToolServerHandle,
process_chunk, and deps.mcp_manager when making the change.

In `@src/memory/search.rs`:
- Around line 79-112: The accessors embedding_model() and embedding_model_arc()
currently unwrap embedding_model and will panic when MemorySearch is constructed
via new_without_embeddings(); change these methods to return
Option<&EmbeddingModel> and Option<&Arc<EmbeddingModel>> (or Result if you
prefer explicit errors) instead of calling expect, update their signatures and
bodies to return self.embedding_model.as_deref() and
self.embedding_model.as_ref(), and then fix all call sites to handle the None
case (e.g., propagate an error or skip embedding-dependent logic) so no
unexpected process panic occurs.

---

Nitpick comments:
In `@migrations/20260307000001_worker_task_preset.sql`:
- Line 1: The migration adds worker_runs.task_preset as freeform text; change it
to enforce allowed values by adding a CHECK constraint (e.g., on table
worker_runs for column task_preset) restricting values to 'default' and
'retrieval' so only those enum-like strings can be persisted; update the ALTER
TABLE/ADD COLUMN statement (or add a subsequent ALTER TABLE ... ADD CONSTRAINT
named e.g. worker_runs_task_preset_check) to include the CHECK (task_preset IN
('default','retrieval')) constraint and keep the same NOT NULL DEFAULT
'default'.

In `@src/knowledge/service.rs`:
- Around line 242-247: The KnowledgeHit being built is copying the full memory
body into snippet (in the construction of KnowledgeHit for result.memory), which
can bloat tool payloads; instead produce a bounded excerpt: use an existing
helper (e.g., crate::summarize_first_non_empty_line or a new helper like
excerpt/truncate) to create a truncated snippet from result.memory.content
(e.g., cap to ~200–400 chars and append "…" when truncated) and assign that to
snippet rather than cloning the full content so large memories no longer
overwhelm recall responses.

In `@src/memory/search.rs`:
- Around line 221-225: The log emitted in the Err(error) arm of the vector
search (the tracing::debug! call in src/memory/search.rs) is misleading because
full-text-search (FTS) matches are still fused into results; change the message
to accurately state that vector search failed but FTS/graph matches will still
be used (e.g., "vector search failed, using graph+FTS results" or similar) and
include the %error in the message for context so debugging points to the correct
fallback path.
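
The bounded-excerpt suggestion for src/knowledge/service.rs amounts to something like this helper; illustrative only, since the PR reportedly reuses summarize_first_non_empty_line instead:

```rust
/// Cap a snippet at `max` characters and mark truncation with an ellipsis,
/// so large memory bodies cannot dominate the recall payload. Iterating by
/// chars (not bytes) avoids splitting a multi-byte character.
fn excerpt(content: &str, max: usize) -> String {
    let mut chars = content.chars();
    let head: String = chars.by_ref().take(max).collect();
    if chars.next().is_some() {
        format!("{head}…")
    } else {
        head
    }
}
```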

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7f7ba3db-e088-4491-b600-5dcfcc64063e

📥 Commits

Reviewing files that changed from the base of the PR and between 203c9d9 and d8db89c.

📒 Files selected for processing (32)
  • docs/content/docs/(core)/cortex.mdx
  • docs/content/docs/(core)/memory.mdx
  • docs/content/docs/(features)/tools.mdx
  • docs/design-docs/mcp.md
  • migrations/20260307000001_worker_task_preset.sql
  • prompts/en/branch.md.j2
  • prompts/en/tools/branch_description.md.j2
  • prompts/en/tools/knowledge_recall_description.md.j2
  • prompts/en/tools/spawn_worker_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/channel.rs
  • src/agent/channel_dispatch.rs
  • src/agent/cortex.rs
  • src/agent/ingestion.rs
  • src/conversation/history.rs
  • src/knowledge.rs
  • src/knowledge/fusion.rs
  • src/knowledge/google_workspace.rs
  • src/knowledge/qmd.rs
  • src/knowledge/registry.rs
  • src/knowledge/service.rs
  • src/knowledge/types.rs
  • src/lib.rs
  • src/mcp.rs
  • src/memory/search.rs
  • src/prompts/text.rs
  • src/tools.rs
  • src/tools/knowledge_recall.rs
  • src/tools/mcp.rs
  • src/tools/spawn_worker.rs
  • tests/bulletin.rs
  • tests/context_dump.rs

@vsumner vsumner force-pushed the feat/f1-qmd-retrieval-slice branch from d8db89c to d8e6cb5 Compare March 8, 2026 21:47

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
src/tools/mcp.rs (1)

151-167: ⚠️ Potential issue | 🟠 Major

Return transport failures as McpToolOutput, not Err.

This still sends connection/runtime failures through the tool error channel, so callers lose the new server / tool / is_error envelope exactly when they need it most. Keep MCP failures in-band as McpToolOutput { is_error: true, ... } so the model can recover consistently.

Suggested change
-        let result = self
-            .connection
-            .call_tool(&self.tool_name, args)
-            .await
-            .map_err(|error| {
-                McpToolError(format!(
-                    "server '{}' tool '{}' failed: {}",
-                    self.server_name, self.tool_name, error
-                ))
-            })?;
-
-        Ok(Self::format_output(
-            &self.server_name,
-            &self.tool_name,
-            &result,
-        ))
+        match self.connection.call_tool(&self.tool_name, args).await {
+            Ok(result) => Ok(Self::format_output(
+                &self.server_name,
+                &self.tool_name,
+                &result,
+            )),
+            Err(error) => Ok(McpToolOutput {
+                server: self.server_name.clone(),
+                tool: self.tool_name.clone(),
+                result: format!(
+                    "server '{}' tool '{}' failed: {}",
+                    self.server_name, self.tool_name, error
+                ),
+                structured_content: None,
+                is_error: true,
+            }),
+        }

Based on learnings "Tool errors are returned as structured results, not panics. The LLM sees the error and can recover (error-as-result for tools pattern)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/mcp.rs` around lines 151 - 167, The call method currently maps
connection failures into an Err(McpToolError) which loses the structured MCP
envelope; instead, when connection.call_tool(...) fails, convert that error into
an in-band McpToolOutput with is_error: true (using the same envelope/fields
produced by Self::format_output) and return Ok(...) so callers receive
McpToolOutput for both success and transport/runtime failures; update the error
handling around connection.call_tool in async fn call to catch the error, build
a McpToolOutput indicating is_error=true (including server_name, tool_name and
the error message) via Self::format_output or the McpToolOutput constructor, and
return Ok(...) rather than Err(McpToolError).
src/config/load.rs (1)

224-268: ⚠️ Potential issue | 🟡 Minor

Reject canonical-name collisions during MCP config load.

Canonicalizing here means aliases like QMD and qmd now deserialize to the same name, but nothing in this path fails fast when that happens. The manager later keys reconciliation by config.name, so one definition can be silently shadowed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/load.rs` around lines 224 - 268, The MCP config loader must reject
duplicate canonical names (e.g., "QMD" vs "qmd") to avoid silent shadowing:
update the code that iterates/deserializes the list of TomlMcpServerConfig to
compute canonicalize_mcp_server_name(&raw.name) for each entry and maintain a
HashSet of seen canonical names, returning a ConfigError::Invalid when a
canonical name is already present; keep parse_mcp_server_config and its use of
canonicalize_mcp_server_name unchanged except ensure the caller does this
duplicate check before inserting/collecting McpServerConfig instances so
McpServerConfig.name cannot collide later.
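The fail-fast duplicate check suggested here is small; a hedged sketch, with plain ASCII lowercasing standing in for the real canonicalize_mcp_server_name rule:

```rust
use std::collections::HashSet;

/// Reject a config load when two server entries collapse to the same
/// canonical name, so one definition cannot silently shadow another.
fn check_unique_canonical(names: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for name in names {
        let canonical = name.to_ascii_lowercase();
        // HashSet::insert returns false when the value was already present.
        if !seen.insert(canonical.clone()) {
            return Err(format!(
                "duplicate MCP server name '{canonical}' after canonicalization"
            ));
        }
    }
    Ok(())
}
```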
src/api/mcp.rs (1)

217-301: ⚠️ Potential issue | 🟠 Major

update_mcp_server still ignores env and headers.

The create path writes both request.env and request.headers, but this PUT handler never rewrites or removes either field. Updating a stdio server's environment, rotating an HTTP auth header, or switching transports will silently leave stale config behind.

🛠️ Suggested fix
             if let Some(url) = &request.url {
                 table["url"] = toml_edit::value(url);
             } else {
                 table.remove("url");
             }
+            if !request.env.is_empty() {
+                let mut env_table = toml_edit::InlineTable::new();
+                for (key, value) in &request.env {
+                    env_table.insert(key, value.as_str().into());
+                }
+                table["env"] = toml_edit::value(env_table);
+            } else {
+                table.remove("env");
+            }
+            if !request.headers.is_empty() {
+                let mut headers_table = toml_edit::InlineTable::new();
+                for (key, value) in &request.headers {
+                    headers_table.insert(key, value.as_str().into());
+                }
+                table["headers"] = toml_edit::value(headers_table);
+            } else {
+                table.remove("headers");
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/mcp.rs` around lines 217 - 301, update_mcp_server currently never
updates or removes the "env" and "headers" keys, leaving stale config; update
the loop that modifies the matched table (in update_mcp_server) to handle
request.env and request.headers the same way as command/args/url: if request.env
is Some and non-empty, write it into table["env"] (convert the map/dict into the
appropriate TOML table/array structure), if None or empty remove
table.remove("env"); do the same for request.headers using table["headers"] /
table.remove("headers") so updates and deletions are applied consistently.
src/tools.rs (1)

495-527: ⚠️ Potential issue | 🟠 Major

knowledge_recall disappears on branch servers that pass None.

KnowledgeRecallTool is only registered inside the Some(mcp_manager) branch, but the service also fronts native memory. The ingestion branch path already passes None, so those branches lose the new unified retrieval tool entirely instead of falling back to native-memory-only retrieval.

Suggested fix
-    mcp_manager: Option<Arc<crate::mcp::McpManager>>,
+    mcp_manager: Arc<crate::mcp::McpManager>,
@@
-    if let Some(mcp_manager) = mcp_manager {
-        server = server.tool(KnowledgeRecallTool::new(
-            memory_search,
-            runtime_config.clone(),
-            mcp_manager,
-        ));
-    }
+    server = server.tool(KnowledgeRecallTool::new(
+        memory_search,
+        runtime_config.clone(),
+        mcp_manager,
+    ));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools.rs` around lines 495 - 527, The KnowledgeRecallTool is only
registered when mcp_manager is Some, causing branch servers that pass None to
omit the tool; change the design to always register KnowledgeRecallTool but
accept and handle an optional mcp_manager internally. Update
KnowledgeRecallTool::new (or add KnowledgeRecallTool::with_optional) to take
Option<Arc<crate::mcp::McpManager>> (rather than requiring a concrete Arc) and
implement fallback behavior when None (use native-memory-only retrieval), then
replace the conditional block in ToolServer::new so you always call server =
server.tool(KnowledgeRecallTool::new(memory_search, runtime_config.clone(),
mcp_manager)); ensuring mcp_manager (the Option) is forwarded unchanged.
docs/content/docs/(features)/tools.mdx (1)

14-30: ⚠️ Potential issue | 🟡 Minor

Keep the overview table in sync with the later branch-tool sections.

memory_delete and worker_inspect are documented lower on this page, but they're missing from the top-level inventory, so the page reads inconsistently.

📝 Minimal doc fix
-Core tools include:
+Core tools include (non-exhaustive):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/content/docs/`(features)/tools.mdx around lines 14 - 30, The top-level
tools table is missing entries for the memory_delete and worker_inspect tools
which are documented later; update the table to include rows for `memory_delete`
(Purpose: remove a memory from the store; Consumers: Branch, Cortex, Compactor)
and `worker_inspect` (Purpose: inspect a running worker's state; Consumers:
Channel, Branch) so the inventory matches the later branch-tool sections and the
page reads consistently.
♻️ Duplicate comments (3)
src/agent/channel_dispatch.rs (1)

598-635: 🛠️ Refactor suggestion | 🟠 Major

Move this retrieval contract into prompts/ instead of embedding it here.

This is system-prompt content. Keeping it inline makes the preset harder to localize/version and easier to drift from the rest of the prompt set. Load it through the prompt engine and compose it here instead.

As per coding guidelines "Don't store prompts as string constants in Rust. System prompts live in prompts/ as markdown files. Load at startup or on demand".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel_dispatch.rs` around lines 598 - 635, The Retrieval contract
is embedded as a long string in apply_worker_task_preset when matching
WorkerTaskPreset::Retrieval; move that system-prompt text into the prompts/
directory as a markdown file (e.g., prompts/retrieval_contract.md), load it
through your prompt engine at startup or on demand (use the existing prompt
loader API/utility used elsewhere), and replace the embedded literal in
apply_worker_task_preset with a composition that formats task + the loaded
retrieval prompt (e.g., format!("{}\n\n{}", task, loaded_prompt)). Ensure
apply_worker_task_preset (and any callers) still returns the same combined
string and add a clear loader usage near where prompts are initialized so the
new file is bundled/loaded at runtime.
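The composition step the prompt describes is small once the contract lives on disk. This sketch assumes the prompt engine has already loaded the markdown file into a string (by whatever loader API the codebase uses); the function name is illustrative, not the project's actual symbol.

```rust
// Compose the worker task with a retrieval contract loaded from a
// markdown file under prompts/, instead of embedding the contract as a
// Rust string constant inside apply_worker_task_preset.
fn apply_retrieval_preset(task: &str, retrieval_contract: &str) -> String {
    format!("{task}\n\n{retrieval_contract}")
}
```

Callers keep receiving the same combined string; only the source of the contract text moves out of the binary.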
src/agent/channel.rs (1)

65-74: ⚠️ Potential issue | 🟠 Major

Default the optional retrieval source strings.

A missing tool, title, snippet, or similar field currently makes the whole payload fail deserialization, and mediate_retrieval_result_for_retrigger() then downgrades an otherwise usable retrieval into “malformed structured output.”

💡 Suggested hardening
 #[derive(Debug, Deserialize)]
 struct RetrievalWorkerSource {
     source: String,
+    #[serde(default)]
     tool: String,
     retrieval_mode: Option<String>,
+    #[serde(default)]
     locator: String,
+    #[serde(default)]
     title: String,
+    #[serde(default)]
     snippet: String,
+    #[serde(default)]
     why_it_matters: String,
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel.rs` around lines 65 - 74, The struct RetrievalWorkerSource
fails deserialization when optional string fields are missing; add serde
defaults so missing fields deserialize to empty strings instead of failing:
annotate the string fields (e.g., source, tool, locator, title, snippet,
why_it_matters) with #[serde(default)] so they default to "" while keeping
retrieval_mode as Option<String>, ensuring RetrievalWorkerSource still derives
Deserialize/Debug; this hardens mediate_retrieval_result_for_retrigger() against
partially populated payloads.
src/knowledge/registry.rs (1)

99-120: ⚠️ Potential issue | 🟠 Major

Use case-insensitive QMD discovery here too.

qmd_enablement() only matches the exact string "qmd". A config named QMD or Qmd will still resolve through the now case-insensitive MCP lookup, but this registry will mark it disabled and omit it from default_source_ids().

Suggested fix
     if mcp_servers
         .iter()
-        .any(|server| server.enabled && server.name == QMD_SOURCE_ID)
+        .any(|server| server.enabled && server.name.eq_ignore_ascii_case(QMD_SOURCE_ID))
     {
@@
     if mcp_servers
         .iter()
-        .any(|server| !server.enabled && server.name == QMD_SOURCE_ID)
+        .any(|server| !server.enabled && server.name.eq_ignore_ascii_case(QMD_SOURCE_ID))
     {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/knowledge/registry.rs` around lines 99 - 120, qmd_enablement currently
checks server.name == QMD_SOURCE_ID with case-sensitive comparison; change both
occurrences to a case-insensitive comparison (e.g., use
server.name.eq_ignore_ascii_case(QMD_SOURCE_ID) or equivalent) so names like
"QMD" or "Qmd" are detected the same as "qmd" while preserving the
enabled/disabled logic in the function qmd_enablement and the QMD_SOURCE_ID
symbol.
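The fix above amounts to swapping `==` for an ASCII case-insensitive comparison. A minimal sketch, with a simplified `ServerEntry` standing in for the real MCP server config type:

```rust
const QMD_SOURCE_ID: &str = "qmd";

// Simplified stand-in for the project's MCP server config struct.
struct ServerEntry {
    name: String,
    enabled: bool,
}

// Case-insensitive enablement check: "qmd", "QMD", and "Qmd" all count
// as the QMD source, matching the now case-insensitive MCP lookup.
fn qmd_enabled(servers: &[ServerEntry]) -> bool {
    servers
        .iter()
        .any(|s| s.enabled && s.name.eq_ignore_ascii_case(QMD_SOURCE_ID))
}
```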
🧹 Nitpick comments (4)
src/memory/store.rs (2)

516-528: Good fix for test isolation; consider tempfile for automatic cleanup.

Using a UUID-based temporary file correctly isolates parallel tests, fixing the state leakage caused by the shared `sqlite::memory:` URI. However, these temp files are never cleaned up after tests complete.

The `tempfile` crate provides `NamedTempFile` and `TempDir` with automatic cleanup on drop, which would prevent orphan files accumulating in CI environments.

♻️ Suggested refactor using tempfile crate

Add tempfile to dev-dependencies in Cargo.toml:

[dev-dependencies]
tempfile = "3"

Then refactor to use automatic cleanup:

     #[cfg(test)]
     pub async fn connect_in_memory() -> Arc<Self> {
         use sqlx::sqlite::SqliteConnectOptions;
+        use std::sync::OnceLock;
+        use tempfile::TempDir;
+
+        // Keep temp dir alive for the duration of tests
+        static TEMP_DIR: OnceLock<TempDir> = OnceLock::new();
+        let temp_dir = TEMP_DIR.get_or_init(|| TempDir::new().expect("temp dir"));

-        let database_path =
-            std::env::temp_dir().join(format!("spacebot-test-{}.sqlite", uuid::Uuid::new_v4()));
+        let database_path =
+            temp_dir.path().join(format!("spacebot-test-{}.sqlite", uuid::Uuid::new_v4()));
         let options = SqliteConnectOptions::new()
             .filename(&database_path)
             .create_if_missing(true);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memory/store.rs` around lines 516 - 528, Replace the manual UUID-named
temp file with tempfile's auto-cleaning handle: add tempfile = "3" to
dev-dependencies, create a NamedTempFile (or TempDir) and use its path when
building SqliteConnectOptions (the current database_path / filename usage), then
let the NamedTempFile own and remove the file when dropped; keep the
single-connection pool creation via
sqlx::pool::PoolOptions::new().max_connections(1).connect_with(options).await
as-is but reference the temp file handle instead of uuid::Uuid::new_v4() so
tests auto-cleanup.

516-537: Consider applying the same isolation pattern to other test stores.

The relevant code snippets show that src/tasks/store.rs and src/projects/store.rs still use the shared sqlite::memory: URI, which is susceptible to the same cross-test state leakage this change fixes.

Would you like me to open an issue to track applying this UUID-based isolation pattern to the other test store helpers?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memory/store.rs` around lines 516 - 537, Replace the in-memory shared
SQLite usage in src/tasks/store.rs and src/projects/store.rs with the same
per-test temp-file pattern used in src/memory/store.rs: create a unique
database_path using uuid::Uuid::new_v4(), build
SqliteConnectOptions::new().filename(&database_path).create_if_missing(true),
open a single-connection pool with
sqlx::pool::PoolOptions::<sqlx::Sqlite>::new().max_connections(1).connect_with(options).await.expect(...),
then run sqlx::migrate!("./migrations").run(&pool).await.expect(...), and return
the store using that pool (mirroring the Arc::new(Self { pool, ... }) pattern)
so each test gets an isolated file-backed DB instead of "sqlite::memory:".
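The core of the isolation pattern is just a per-test unique path. This is a std-only approximation for illustration; the PR itself uses `uuid::Uuid::new_v4()` for the unique suffix.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicU64, Ordering};

// Process id plus a monotonic counter keeps parallel tests from sharing a
// database file the way the shared `sqlite::memory:` URI does. The real
// code derives uniqueness from a UUID instead.
fn unique_test_db_path() -> PathBuf {
    static COUNTER: AtomicU64 = AtomicU64::new(0);
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);
    std::env::temp_dir().join(format!("spacebot-test-{}-{n}.sqlite", std::process::id()))
}
```

Each store helper would feed this path into `SqliteConnectOptions::new().filename(...)` so every test opens its own file-backed database.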
prompts/en/tools/spawn_worker_description.md.j2 (1)

1-1: Call out the branch fallback for context-dependent lookups.

This new retrieval guidance makes it easy to send “look up the thing we were just discussing” to a worker, but the worker only sees the task text. Adding one sentence here to redirect context-dependent lookups to a branch would prevent a common misuse.

✏️ Suggested wording
-Spawn an independent worker process. By default uses a built-in agent with {tools} tools. The worker only sees the task description you provide — no conversation history. Use `task_preset: "retrieval"` when the built-in worker should search external knowledge tools not covered by branch retrieval (for example web connectors) and return synthesized evidence with provenance instead of a raw tool dump. Do not combine `task_preset: "retrieval"` with `worker_type: "opencode"`.{opencode_note}
+Spawn an independent worker process. By default uses a built-in agent with {tools} tools. The worker only sees the task description you provide — no conversation history. Use `task_preset: "retrieval"` when the built-in worker should search external knowledge tools not covered by branch retrieval (for example web connectors) and return synthesized evidence with provenance instead of a raw tool dump. If the lookup depends on earlier conversation context, use a branch instead. Do not combine `task_preset: "retrieval"` with `worker_type: "opencode"`.{opencode_note}

Based on learnings: Don't give workers channel context. Workers get a fresh prompt and a task. If a worker needs conversation context, that's a branch, not a worker.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prompts/en/tools/spawn_worker_description.md.j2` at line 1, The guidance in
the spawn_worker_description.md.j2 text needs an extra sentence calling out that
context-dependent lookups should use a branch rather than a worker; update the
template string that begins "Spawn an independent worker process..." to append a
clear one-sentence fallback like "If the lookup depends on recent conversation
context, use a branch for that context rather than sending it to a worker" (or
similar wording), and ensure this note is placed near the sentence about the
worker only seeing the task description so readers see the branch fallback
immediately when they read about context limitations.
tests/context_dump.rs (1)

24-340: Extract the hermetic bootstrap into shared test support.

This harness now duplicates most of tests/bulletin.rs—mock server setup, LLM config, seeded memory, and bootstrap flow. Keeping two copies will drift the test environment and make future harness fixes easy to miss.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/context_dump.rs` around lines 24 - 340, The test file duplicates the
hermetic bootstrap/harness logic used in tests/bulletin.rs (types and functions
like ContextDumpHarness, PreparedMockServer, prepare_mock_server,
MockResponsesState, mock_responses_handler, test_llm_config,
resolved_agent_config, seed_memory_fixtures, bootstrap_harness_inner and
bootstrap_harness); extract these into a single shared test support module (eg.
tests/support or tests/common) and make the helpers pub(crate) so both tests
IMPORT them instead of redefining; replace the duplicated definitions in
context_dump.rs with use statements that import the extracted symbols and adjust
any visibility or path issues (ensure prepare_mock_server returns
PreparedMockServer, bootstrap_harness_inner remains async and
ContextDumpHarness.Drop still aborts server_task) so both bulletin.rs and
context_dump.rs reuse the same harness code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:

In `@src/tools/memory_save.rs`:
- Around line 293-303: The memory_save flow calls
embedding_model_arc().embed_one(...) after calling store.save(&memory), which
can cause memory_save to return an error even though the DB row was persisted;
either preflight the embedder or compensate on failure. Fix by moving the
embedding_model_arc() availability check (and optionally the embed_one call or a
lightweight preflight) to before calling store.save(&memory) in memory_save, or
if you prefer post-save behavior then catch embedding errors and perform a
compensating rollback by deleting the saved memory row (use the same
store.delete/delete_by_id or rollback method used elsewhere) and undo any
association creation before returning MemorySaveError so no partial/duplicated
records remain.

---

Outside diff comments:

In `@src/tools/mcp.rs`:
- Around line 151-167: The call method currently maps connection failures into
an Err(McpToolError) which loses the structured MCP envelope; instead, when
connection.call_tool(...) fails, convert that error into an in-band
McpToolOutput with is_error: true (using the same envelope/fields produced by
Self::format_output) and return Ok(...) so callers receive McpToolOutput for
both success and transport/runtime failures; update the error handling around
connection.call_tool in async fn call to catch the error, build a McpToolOutput
indicating is_error=true (including server_name, tool_name and the error
message) via Self::format_output or the McpToolOutput constructor, and return
Ok(...) rather than Err(McpToolError).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dcf38395-4579-4be2-a96a-5716727a3b30

📥 Commits

Reviewing files that changed from the base of the PR and between d8db89c and d8e6cb5.

📒 Files selected for processing (38)
  • docs/content/docs/(core)/cortex.mdx
  • docs/content/docs/(core)/memory.mdx
  • docs/content/docs/(features)/tools.mdx
  • docs/design-docs/mcp.md
  • migrations/20260308000001_worker_task_preset.sql
  • prompts/en/branch.md.j2
  • prompts/en/tools/branch_description.md.j2
  • prompts/en/tools/knowledge_recall_description.md.j2
  • prompts/en/tools/spawn_worker_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/channel.rs
  • src/agent/channel_dispatch.rs
  • src/agent/cortex.rs
  • src/agent/ingestion.rs
  • src/api/mcp.rs
  • src/config.rs
  • src/config/load.rs
  • src/config/types.rs
  • src/conversation/history.rs
  • src/knowledge.rs
  • src/knowledge/fusion.rs
  • src/knowledge/google_workspace.rs
  • src/knowledge/qmd.rs
  • src/knowledge/registry.rs
  • src/knowledge/service.rs
  • src/knowledge/types.rs
  • src/lib.rs
  • src/mcp.rs
  • src/memory/search.rs
  • src/memory/store.rs
  • src/prompts/text.rs
  • src/tools.rs
  • src/tools/knowledge_recall.rs
  • src/tools/mcp.rs
  • src/tools/memory_save.rs
  • src/tools/spawn_worker.rs
  • tests/bulletin.rs
  • tests/context_dump.rs
🚧 Files skipped from review as they are similar to previous changes (9)
  • docs/design-docs/mcp.md
  • src/knowledge/google_workspace.rs
  • src/knowledge/service.rs
  • src/conversation/history.rs
  • src/lib.rs
  • src/prompts/text.rs
  • prompts/en/branch.md.j2
  • prompts/en/tools/knowledge_recall_description.md.j2
  • docs/content/docs/(core)/cortex.mdx

@@ -0,0 +1 @@
ALTER TABLE worker_runs ADD COLUMN task_preset TEXT NOT NULL DEFAULT 'default';

⚠️ Potential issue | 🟡 Minor

Add a CHECK constraint for task_preset.

Right now any string can be stored here, and the runtime later warns and falls back to default. That makes bad data silently change resumed worker behavior instead of failing fast. Please constrain the column to the known preset values at the schema layer.

Suggested change
-ALTER TABLE worker_runs ADD COLUMN task_preset TEXT NOT NULL DEFAULT 'default';
+ALTER TABLE worker_runs
+ADD COLUMN task_preset TEXT NOT NULL DEFAULT 'default'
+CHECK (task_preset IN ('default', 'retrieval'));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@migrations/20260308000001_worker_task_preset.sql` at line 1, The migration
adds worker_runs.task_preset without constraining values—update the migration so
the column has a CHECK constraint limiting values to the known preset set (e.g.,
'default' plus any other runtime presets) and ensure the DEFAULT 'default' is
one of those values; modify the ALTER TABLE statement that adds task_preset (or
add an ADD CONSTRAINT CHECK on worker_runs.task_preset) to use task_preset IN
('default', '<other_preset1>', '<other_preset2>') matching the runtime constants
so invalid strings are rejected at the schema layer.
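Mirroring the constraint on the Rust side keeps schema and runtime from drifting apart. A minimal sketch, assuming `'default'` and `'retrieval'` are the only presets (matching the suggested `CHECK` above); the constant name is illustrative:

```rust
// Runtime preset set; the SQL CHECK constraint should list exactly these
// values ("retrieval" is the preset this PR adds).
const KNOWN_TASK_PRESETS: [&str; 2] = ["default", "retrieval"];

fn is_valid_task_preset(value: &str) -> bool {
    KNOWN_TASK_PRESETS.contains(&value)
}
```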

Comment on lines +1513 to 1522
match deps.memory_search.embedding_model_arc() {
Some(embedding_model) => match embedding_model.embed_one("warmup").await {
Ok(_) => embedding_ready = true,
Err(error) => {
errors.push(format!("embedding warmup failed: {error}"));
}
},
None => {
errors.push("embedding warmup skipped: embedding model unavailable".to_string());
}

⚠️ Potential issue | 🟠 Major

Don't degrade warmup when embeddings are intentionally absent.

embedding_model_arc() == None currently goes into errors, so a no-embeddings setup never reaches WarmupState::Warm even if bulletin generation succeeds. That leaves the optional-embeddings path permanently degraded and can keep readiness/forced-warmup paths firing for a supported configuration.

Suggested fix
         match deps.memory_search.embedding_model_arc() {
             Some(embedding_model) => match embedding_model.embed_one("warmup").await {
                 Ok(_) => embedding_ready = true,
                 Err(error) => {
                     errors.push(format!("embedding warmup failed: {error}"));
                 }
             },
-            None => {
-                errors.push("embedding warmup skipped: embedding model unavailable".to_string());
-            }
+            None => {
+                tracing::info!("embedding warmup skipped: embedding model unavailable");
+            }
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/cortex.rs` around lines 1513 - 1522: the current match on
deps.memory_search.embedding_model_arc() treats None as an error by pushing into
the errors vector, which prevents reaching WarmupState::Warm when embeddings are
intentionally absent. Update the None branch so it does not push an error
(treat the case as "skipped", not "failed"), logging instead and leaving or
setting embedding_ready as appropriate. Keep the existing embed_one("warmup")
Ok/Err handling and its error push for real embed failures, so embedding_ready,
errors, and the WarmupState::Warm transition behave correctly.

Comment on lines +349 to +350
assert!(output.summary.contains("qmd"));
assert!(!output.summary.contains("query"));
⚠️ Potential issue | 🟡 Minor

Drop the generic contains("query") assertion.

This isn't tied to the behavior under test and can fail if a title or source-status message legitimately contains that word. Remove it or replace it with a narrower invariant.

🧪 Proposed test fix
         assert!(output.summary.contains("qmd"));
-        assert!(!output.summary.contains("query"));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/knowledge_recall.rs` around lines 349 - 350: the test asserts a
generic negative invariant, assert!(!output.summary.contains("query")), that is
unrelated to the behavior under test and brittle. Remove that second assertion,
or replace it with a narrower check tied to the expected behavior (for example,
assert!(!output.summary.contains("query:")), or asserting the absence/presence
of a specific unwanted phrase), keeping the existing positive assertion
assert!(output.summary.contains("qmd")).

Comment on lines 293 to 303
let embedding = self
.memory_search
.embedding_model_arc()
.ok_or_else(|| {
MemorySaveError(
"Failed to generate embedding: embedding model unavailable".to_string(),
)
})?
.embed_one(&args.content)
.await
.map_err(|e| MemorySaveError(format!("Failed to generate embedding: {e}")))?;
⚠️ Potential issue | 🟠 Major

Avoid returning a failed save after the memory was already persisted.

This new embedding_model_arc() guard runs after store.save(&memory) and association creation, so memory_save can now return an error even though the row already exists. A retry will duplicate the memory and leave a partially indexed record behind. Preflight the embedder before the first write, or compensate by deleting/rolling back the saved memory on any post-save embedding failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/memory_save.rs` around lines 293 - 303: memory_save calls
embedding_model_arc().embed_one(...) after store.save(&memory), so it can
return an error even though the DB row was already persisted. Either preflight
the embedder — move the embedding_model_arc() availability check (and
optionally the embed_one call, or a lightweight preflight) before
store.save(&memory) — or keep the post-save ordering and compensate on
failure: catch embedding errors, delete the saved memory row (using the same
store.delete/delete_by_id or rollback method used elsewhere), and undo any
association creation before returning MemorySaveError, so no partial or
duplicated records remain.
