Enhance chat completion functionality to support OpenAI-style message history #1674
mdemoret-nv wants to merge 1 commit into NVIDIA:develop from
Conversation
… history

- Updated the chat completion function to accept both single input messages and full conversation histories, allowing for context-aware responses.
- Introduced a utility function to convert NAT messages to LangChain messages, prepending a system prompt if necessary.
- Improved error handling to provide more informative responses in case of failures.
- Added usage tracking for API compatibility, including token counts for prompts and completions.

This update enhances the overall user experience by enabling more dynamic and contextually relevant interactions.

Signed-off-by: Michael Demoret <mdemoret@nvidia.com>
Walkthrough

The chat completion function now supports OpenAI-style chat history. A new internal helper converts NAT messages to LangChain format. The handler signature changed to accept ChatRequest or single messages and returns structured ChatResponse objects with usage statistics instead of simple strings.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Handler as _chat_completion
    participant Converter as _messages_to_langchain_messages
    participant LLM as LangChain LLM
    participant Response as ChatResponse Builder
    alt Chat History Input
        Client->>Handler: ChatRequest with messages
        Handler->>Converter: NAT messages + system_prompt
        Converter->>Converter: Convert to LangChain format
        Converter-->>Handler: LangChain messages
    else Single Message Input
        Client->>Handler: String message
        Handler->>Handler: Convert to ChatRequest
        Handler->>Converter: Message + system_prompt
        Converter-->>Handler: LangChain messages
    end
    Handler->>LLM: Invoke with messages
    LLM-->>Handler: Raw response + token counts
    Handler->>Response: Calculate usage stats
    Response-->>Handler: ChatResponse object
    Handler-->>Client: ChatResponse or str
```
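The "Calculate usage stats" step in the diagram can be sketched with plain dataclasses. `ChatUsage` and `build_usage` below are illustrative stand-ins, not the actual NAT data models:

```python
# Illustrative sketch of the "Calculate usage stats" step above.
# ChatUsage and build_usage are hypothetical stand-ins, not the real
# NAT/OpenAI response models.

from dataclasses import dataclass


@dataclass
class ChatUsage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def build_usage(prompt_tokens: int, completion_tokens: int) -> ChatUsage:
    """Assemble OpenAI-style usage stats from the LLM's reported token counts."""
    return ChatUsage(
        prompt_tokens=prompt_tokens,
        completion_tokens=completion_tokens,
        total_tokens=prompt_tokens + completion_tokens,
    )


usage = build_usage(prompt_tokens=12, completion_tokens=30)
# usage.total_tokens == 42
```

The real handler would attach this usage object to the ChatResponse it yields back to the client.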
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/nvidia_nat_core/src/nat/tool/chat_completion.py (1)
49-60: Add a return type hint and use a more specific type annotation for `nat_messages`.

The function is missing a return type annotation, and `nat_messages: list` is overly generic. Per coding guidelines, all functions require type hints and should prefer typing abstractions. Additionally, consider using iterable unpacking (Ruff RUF005) for cleaner list construction.

Proposed fix:

```diff
+from collections.abc import Sequence
+
+from langchain_core.messages import BaseMessage
+from nat.data_models.api_server import Message
+
 def _messages_to_langchain_messages(
-    nat_messages: list,
+    nat_messages: Sequence[Message],
     system_prompt: str,
-):
+) -> list[BaseMessage]:
     """Convert NAT Message list to LangChain BaseMessage list with system prompt prepended if needed."""
     from langchain_core.messages.utils import convert_to_messages

     message_dicts = [m.model_dump() for m in nat_messages]
     has_system = any(d.get("role") == "system" for d in message_dicts)
     if not has_system and system_prompt:
-        message_dicts = [{"role": "system", "content": system_prompt}] + message_dicts
+        message_dicts = [{"role": "system", "content": system_prompt}, *message_dicts]
     return convert_to_messages(message_dicts)
```

Note: The `Message` import may already be available or may need to be added. The `BaseMessage` import can stay inside the function if preferred, to avoid a top-level langchain dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_core/src/nat/tool/chat_completion.py` around lines 49 - 60, Add proper type hints and tighten the nat_messages type: change the parameter nat_messages: list to nat_messages: Iterable[Message] (import Message from langchain_core.messages or the appropriate module) and add a return type of list[BaseMessage] (import BaseMessage or keep it as a local import). Inside _messages_to_langchain_messages, build the final message_dicts using iterable unpacking instead of concatenation when prepending the system prompt (e.g., [{"role":"system","content": system_prompt}, *message_dicts]) and keep the convert_to_messages call to produce the final List[BaseMessage]; ensure necessary imports for Message and BaseMessage are added or referenced locally in the function.
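The prepend-with-unpacking detail in the suggestion can be exercised without langchain installed. The helper below is a hypothetical stand-in that mimics only the system-prompt prepend step; plain dicts stand in for NAT Message objects:

```python
# Minimal sketch of the system-prompt prepend logic discussed above.
# prepend_system_prompt is a hypothetical stand-in; plain dicts stand in
# for NAT Message objects, so this runs without langchain installed.

def prepend_system_prompt(message_dicts: list[dict], system_prompt: str) -> list[dict]:
    """Prepend a system message unless one is already present."""
    has_system = any(d.get("role") == "system" for d in message_dicts)
    if not has_system and system_prompt:
        # Iterable unpacking (RUF005 style) instead of list concatenation
        return [{"role": "system", "content": system_prompt}, *message_dicts]
    return list(message_dicts)


history = [{"role": "user", "content": "Hello"}]
result = prepend_system_prompt(history, "You are a helpful assistant.")
# result[0] is the injected system message; a history that already
# contains a system message is returned unchanged
```

In the real helper, the resulting dict list would then be passed to LangChain's `convert_to_messages`.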
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/nvidia_nat_core/src/nat/tool/chat_completion.py`:
- Around line 118-131: The except block currently swallows errors, fails to log
stack traces, and returns the raw exception text to the user; update the handler
to log the full exception via logger.exception(...) when the outer Exception is
caught and also remove the use of str(e) in the returned user-facing message,
returning a generic error string instead; replace the silent nested try-except
around GlobalTypeConverter.get().convert(...) with a logged except (e.g., log
conversion errors with logger.exception(...) referencing
GlobalTypeConverter.get().convert and the local variables msg/last_content) so
conversion failures aren’t silently dropped, and ensure the final return uses
only a non-sensitive generic message that may include last_content but not the
raw exception details.
- Around line 96-99: The current extraction uses hasattr(response, "text") and
response.text(), which is incorrect for LangChain BaseMessage; update the logic
in the chat completion handler where response and output_text are set (in
packages/nvidia_nat_core/src/nat/tool/chat_completion.py) to read from
response.content instead: replace the branch that checks for response.text()
with code that assigns output_text = response.content if it's already a string
or serializes it (e.g., str(response.content)) otherwise, ensuring you still
handle the case where response is a plain str by keeping the existing
isinstance(response, str) check.
```python
if isinstance(response, str):
    output_text = response
else:
    output_text = response.text() if hasattr(response, "text") else str(response.content)
```
🧩 Analysis chain
🏁 Scripts executed:

```shell
# First, examine the actual code in the file to understand context and imports
head -120 packages/nvidia_nat_core/src/nat/tool/chat_completion.py | tail -60

# Check imports to understand what types are being used
head -30 packages/nvidia_nat_core/src/nat/tool/chat_completion.py

# Search for how LangChain message content is accessed in the codebase
rg -n 'BaseMessage|\.content|\.text\(' packages/nvidia_nat_core/src/nat/ --type=py | head -30

# Check if there are any LangChain related files that show message handling patterns
find packages -name "*.py" -type f -exec rg -l "BaseMessage" {} \; | head -10
```

Repository: NVIDIA/NeMo-Agent-Toolkit
🌐 Web query: LangChain BaseMessage content property API documentation

💡 Result:

- Python (`langchain_core.messages.base.BaseMessage`): `content` is a required field (`str | list[str | dict]`) holding the message payload; `BaseMessage` is the abstract base for `HumanMessage`, `AIMessage`, etc. See the API doc for `BaseMessage` (params/attributes include `content`, `additional_kwargs`, `response_metadata`, `type`, etc.). [1]
- JavaScript/TypeScript (`@langchain/core` messages): message classes (e.g., `AIMessage`) inherit from `BaseMessage` and expose a `content` property whose type supports either plain text or structured content blocks (LangChain v1 message content blocks). See `AIMessage` (shows the `content` type) and the Messages docs (explains what `content` can contain). [2], [3]

Sources: [1] [2] [3]
Simplify message content extraction to use LangChain's content property directly.

LangChain's `BaseMessage` uses a `content` property (`str | list`), not a `text()` method, so the `hasattr(response, "text")` check will always be False for LangChain messages. Simplify line 99 to `output_text = response.content if isinstance(response.content, str) else str(response.content)` (or simply `output_text = response.content` if the content is guaranteed to be a string).
```diff
-    except Exception as e:
-        # Fallback response if LLM call fails
-        return (f"I apologize, but I encountered an error while processing your "
-                f"query: '{query}'. Please try rephrasing your question or try "
-                f"again later. Error: {str(e)}")

     yield _chat_completion

+        last_content = ""
+        try:
+            msg = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
+            if msg.messages:
+                last = msg.messages[-1].content
+                last_content = last if isinstance(last, str) else str(last)
+        except Exception:
+            pass
+        return (
+            f"I apologize, but I encountered an error while processing your "
+            f"query: '{last_content}'. Please try rephrasing your question or try "
+            f"again later. Error: {str(e)}"
+        )
```
Add logging for caught exceptions and avoid exposing raw error details to users.

The exception handling has several issues:

- Missing logging: Per coding guidelines, when catching exceptions without re-raising, use `logger.exception()` to capture the full stack trace.
- Silent exception swallowing (lines 125-126): The nested `try-except-pass` silently discards errors, making debugging difficult (Ruff S110).
- Information leakage: Exposing `str(e)` in user-facing messages may reveal internal implementation details.
Proposed fix:

```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
         except Exception as e:
+            logger.exception("Error processing chat completion request")
             last_content = ""
             try:
                 msg = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
                 if msg.messages:
                     last = msg.messages[-1].content
                     last_content = last if isinstance(last, str) else str(last)
-            except Exception:
-                pass
+            except Exception:
+                logger.debug("Could not extract last message content for error reporting")
             return (
                 f"I apologize, but I encountered an error while processing your "
-                f"query: '{last_content}'. Please try rephrasing your question or try "
-                f"again later. Error: {str(e)}"
+                f"query: '{last_content}'. Please try rephrasing your question or try again later."
             )
```

Removing the raw exception from the user-facing message prevents potential information leakage while the logged exception preserves full debugging context.
🧰 Tools
🪛 Ruff (0.15.2)
[warning] 118-118: Do not catch blind exception: Exception
(BLE001)
[error] 125-126: try-except-pass detected, consider logging the exception
(S110)
[warning] 125-125: Do not catch blind exception: Exception
(BLE001)
[warning] 130-130: Use explicit conversion flag
Replace with conversion flag
(RUF010)
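The logging pattern this comment recommends can be sketched in isolation. `handle_request` and `broken_backend` below are hypothetical stand-ins for the handler and the failing LLM call in chat_completion.py:

```python
# Sketch of the recommended error handling: log the full traceback,
# return a generic user-facing message with no raw exception text.
# handle_request and broken_backend are hypothetical stand-ins.

import logging

logger = logging.getLogger(__name__)


def broken_backend(query: str) -> str:
    raise RuntimeError("internal detail: db password rejected")


def handle_request(query: str) -> str:
    try:
        return broken_backend(query)
    except Exception:
        # Full stack trace goes to the logs, not to the user
        logger.exception("Error processing chat completion request")
        return (f"I apologize, but I encountered an error while processing your "
                f"query: '{query}'. Please try rephrasing your question or try again later.")


reply = handle_request("hello")
# reply echoes the query but contains none of the exception details
```

The traceback still reaches operators through the logging backend, so nothing is lost by keeping the user-facing string generic.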
willkill07 left a comment:
Approved. do we want this rebased+retargeted for release/1.5 ?
Description
This update enhances the overall user experience by enabling more dynamic and contextually relevant interactions.
By Submitting this PR I confirm: