Description
Problem
When users ask questions that contain ambiguous or overloaded terms, the assistant's retrieval tools may not find the most relevant documentation. This came up in feedback from MNE-Python maintainers (mne-tools/mne-python#13702):
- BDF status channel question: "How do I parse the status channel of a BDF file correctly?" -- The word "status" is ambiguous (general status vs. the specific BioSemi `status` channel). The assistant retrieved general BDF reading docs but missed the specific `status` channel behavior documented in `read_raw_bdf()`.
- Eyetracking unit conversion: "How can I convert the units of eyetracking data from pixels-on-screen to radians of visual angle using MNE-Python?" -- The assistant said MNE doesn't have built-in functions for this, but `mne.preprocessing.eyetracking.convert_units` exists and is used in 3 tutorials.
In both cases, rephrasing the question (e.g., putting status in backticks, or asking about the specific function) produced correct answers. The retrieval worked; the query formulation was the bottleneck.
Proposed Solution
Add a query expansion step before tool calls. When the user asks a question, the agent (or a lightweight pre-processing step) generates multiple reformulations to improve retrieval coverage:
Original: "How do I parse the status channel of a BDF file correctly?"
Expanded queries:
- read_raw_bdf status channel
- BDF status channel trigger events
- BioSemi status channel parsing
Original: "How can I convert eyetracking data from pixels to radians?"
Expanded queries:
- eyetracking convert_units pixels radians
- mne.preprocessing.eyetracking
- eye tracking unit conversion visual angle
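As a rough illustration of how expanded queries could feed retrieval, the sketch below searches with the original question plus each reformulation and merges the hits without duplicates. Everything here is hypothetical: `DOCS`, `keyword_search`, and `retrieve_with_expansion` are toy stand-ins, not the assistant's actual retrieval tools or index.

```python
import re
from typing import Callable

# Toy corpus standing in for the real documentation index (hypothetical).
DOCS = {
    "read_raw_bdf": "read_raw_bdf: the status channel of BioSemi files holds trigger events",
    "bdf_overview": "General notes on how to read a BDF file",
    "convert_units": "mne.preprocessing.eyetracking.convert_units converts pixels to radians",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into word-like tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def keyword_search(query: str, top_k: int = 3) -> list[str]:
    """Toy retriever: rank documents by the number of shared tokens."""
    words = tokenize(query)
    scored = []
    for doc_id, text in DOCS.items():
        overlap = len(words & tokenize(text))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def retrieve_with_expansion(
    question: str,
    expanded_queries: list[str],
    search: Callable[[str], list[str]] = keyword_search,
) -> list[str]:
    """Run the original question and every reformulation through `search`,
    merging results in first-seen order without duplicates."""
    merged: list[str] = []
    for query in [question, *expanded_queries]:
        for doc_id in search(query):
            if doc_id not in merged:
                merged.append(doc_id)
    return merged
```

Keeping the original question in the query list means expansion can only add candidates, never remove what plain retrieval would have found.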
Implementation Options
- Prompt-based expansion: Add instructions to the system prompt telling the agent to generate multiple search queries per question and call tools multiple times with different phrasings.
- Pre-processing agent: A lightweight agent node that runs before the main agent, expanding the user's question into multiple retrieval queries. This could be a simple LLM call with a focused prompt.
- Tool-level expansion: Modify the retrieval tools themselves to do query expansion internally (e.g., generate synonyms, extract technical terms).
Option 1 (prompt-based expansion) is simplest and can be tried first. Option 2 (the pre-processing agent) is more robust for production.
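A minimal sketch of what the pre-processing step might look like, assuming the LLM is available as a plain `prompt -> text` callable. The prompt wording, `EXPANSION_PROMPT`, `expand_query`, and the `llm` parameter are all illustrative assumptions, not an existing interface in this project.

```python
import re
from typing import Callable

# Hypothetical prompt; the wording is an assumption and would need tuning.
EXPANSION_PROMPT = (
    "Rewrite the user's question as up to {n} short documentation search "
    "queries, one per line. Prefer exact function, module, and parameter "
    "names where you can infer them.\n\nQuestion: {question}\nQueries:"
)

# Matches a leading list marker such as "- ", "* ", "1. " or "2) ".
_LIST_MARKER = re.compile(r"^\s*(?:[-*]|\d+[.)])\s+")


def expand_query(
    question: str,
    llm: Callable[[str], str],
    max_queries: int = 3,
) -> list[str]:
    """Return the original question followed by up to `max_queries`
    distinct reformulations parsed from the LLM's line-per-query reply."""
    raw = llm(EXPANSION_PROMPT.format(n=max_queries, question=question))
    queries = [question]
    seen = {question.lower()}
    for line in raw.splitlines():
        candidate = _LIST_MARKER.sub("", line).strip()
        if not candidate or candidate.lower() in seen:
            continue  # skip blank lines and case-insensitive duplicates
        queries.append(candidate)
        seen.add(candidate.lower())
        if len(queries) == max_queries + 1:
            break
    return queries
```

Capping the number of reformulations bounds the number of extra tool calls, which matters for the latency criterion below.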
Acceptance Criteria
- Ambiguous questions produce expanded search queries
- The BDF status channel question returns the correct `read_raw_bdf()` documentation
- The eyetracking conversion question finds `mne.preprocessing.eyetracking.convert_units`
- Query expansion does not significantly increase response latency (< 1s additional)
- Works across all communities, not just MNE
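The latency criterion could be checked with a small timing helper like the one below. This is only a sketch: `added_latency`, `within_budget`, and `LATENCY_BUDGET_S` are hypothetical names, and the helper measures the expansion callable in isolation rather than end-to-end response time.

```python
import time
from typing import Callable

LATENCY_BUDGET_S = 1.0  # the "< 1s additional" budget from the criteria above


def added_latency(
    expand: Callable[[str], object], question: str, repeats: int = 5
) -> float:
    """Median wall-clock seconds spent in `expand(question)`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        expand(question)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


def within_budget(
    expand: Callable[[str], object],
    question: str,
    budget_s: float = LATENCY_BUDGET_S,
) -> bool:
    """True if the median expansion time stays under the budget."""
    return added_latency(expand, question) < budget_s
```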
References
- MNE PR discussion: Add OSA chat widget to documentation mne-tools/mne-python#13702
- Feedback from @scott-huberty (eyetracking) and @cbrnr (BDF status)