Enhance chat completion functionality to support OpenAI-style message history#1674

Open
mdemoret-nv wants to merge 1 commit into NVIDIA:develop from mdemoret-nv:mdd_chat-completion-message-history
Conversation

@mdemoret-nv
Collaborator

@mdemoret-nv mdemoret-nv commented Feb 26, 2026

Description

  • Updated the chat completion function to accept both single input messages and full conversation histories, allowing for context-aware responses.
  • Introduced a utility function to convert NAT messages to LangChain messages, prepending a system prompt if necessary.
  • Improved error handling to provide more informative responses in case of failures.
  • Added usage tracking for API compatibility, including token counts for prompts and completions.

This update enhances the overall user experience by enabling more dynamic and contextually relevant interactions.
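A minimal sketch of the system-prompt handling described in the second bullet above (the function name and the dict-based message shape are illustrative stand-ins, not the actual NAT API):

```python
def with_system_prompt(messages: list[dict], system_prompt: str) -> list[dict]:
    """Prepend a system message unless the history already contains one.

    Messages use the OpenAI-style {"role": ..., "content": ...} shape.
    """
    if system_prompt and not any(m.get("role") == "system" for m in messages):
        return [{"role": "system", "content": system_prompt}, *messages]
    return list(messages)

history = [
    {"role": "user", "content": "What does this PR change?"},
    {"role": "assistant", "content": "It adds chat-history support."},
    {"role": "user", "content": "Does it track token usage?"},
]
converted = with_system_prompt(history, "You are a helpful assistant.")
```

Calling it again on `converted` is a no-op, since a system message is already present.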

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for OpenAI-style chat history in chat completion requests.
    • Chat completion now returns structured responses with detailed token usage statistics (prompt tokens, completion tokens, and total tokens).
    • Chat completion can now handle both single input messages and full conversation histories seamlessly.
  • Improvements

    • Enhanced error handling to provide more contextual error messages by preserving conversation context.
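The token usage statistics mentioned above can be pictured as a small model like the following (the field names follow the OpenAI convention; the PR's actual ChatResponse schema may differ):

```python
from dataclasses import dataclass

@dataclass
class Usage:
    """Hypothetical usage model with OpenAI-style field names."""
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        # Total is derived, so the three counts can never disagree.
        return self.prompt_tokens + self.completion_tokens

usage = Usage(prompt_tokens=120, completion_tokens=34)
```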

… history

- Updated the chat completion function to accept both single input messages and full conversation histories, allowing for context-aware responses.
- Introduced a utility function to convert NAT messages to LangChain messages, prepending a system prompt if necessary.
- Improved error handling to provide more informative responses in case of failures.
- Added usage tracking for API compatibility, including token counts for prompts and completions.

This update enhances the overall user experience by enabling more dynamic and contextually relevant interactions.

Signed-off-by: Michael Demoret <mdemoret@nvidia.com>
@mdemoret-nv mdemoret-nv requested a review from a team as a code owner February 26, 2026 20:36
@mdemoret-nv mdemoret-nv added bug Something isn't working non-breaking Non-breaking change labels Feb 26, 2026
@coderabbitai

coderabbitai bot commented Feb 26, 2026

Walkthrough

The chat completion function now supports OpenAI-style chat history. A new internal helper converts NAT messages to LangChain format. The handler signature changed to accept ChatRequest or single messages and returns structured ChatResponse objects with usage statistics instead of simple strings.

Changes

Chat Completion Enhancement (packages/nvidia_nat_core/src/nat/tool/chat_completion.py):
Added OpenAI-style chat history support with a _messages_to_langchain_messages helper for message conversion. Modified the handler signature from `async def _chat_completion(query: str) -> str` to `async def _chat_completion(chat_request_or_message: ChatRequestOrMessage) -> ChatResponse`.
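The dual input accepted by the new signature can be sketched like this (ChatRequest below is a minimal stand-in for the NAT data model, not the real class):

```python
from dataclasses import dataclass, field

@dataclass
class ChatRequest:  # stand-in for NAT's ChatRequest model
    messages: list = field(default_factory=list)

def normalize(chat_request_or_message) -> ChatRequest:
    """Accept either a bare user string or a full chat request."""
    if isinstance(chat_request_or_message, str):
        # Wrap a single message as a one-turn history.
        return ChatRequest(
            messages=[{"role": "user", "content": chat_request_or_message}])
    return chat_request_or_message

req = normalize("Hello")
```

A full ChatRequest passes through unchanged, so downstream code only ever sees one shape.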

Sequence Diagram

sequenceDiagram
    participant Client
    participant Handler as _chat_completion
    participant Converter as _messages_to_langchain_messages
    participant LLM as LangChain LLM
    participant Response as ChatResponse Builder

    alt Chat History Input
        Client->>Handler: ChatRequest with messages
        Handler->>Converter: NAT messages + system_prompt
        Converter->>Converter: Convert to LangChain format
        Converter-->>Handler: LangChain messages
    else Single Message Input
        Client->>Handler: String message
        Handler->>Handler: Convert to ChatRequest
        Handler->>Converter: Message + system_prompt
        Converter-->>Handler: LangChain messages
    end

    Handler->>LLM: Invoke with messages
    LLM-->>Handler: Raw response + token counts
    Handler->>Response: Calculate usage stats
    Response-->>Handler: ChatResponse object
    Handler-->>Client: ChatResponse or str

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 passed

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change (enhancing chat completion to support OpenAI-style message history), is concise and descriptive, and uses imperative mood.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/nvidia_nat_core/src/nat/tool/chat_completion.py (1)

49-60: Add return type hint and use more specific type annotation for nat_messages.

The function is missing a return type annotation, and nat_messages: list is overly generic. Per coding guidelines, all functions require type hints and should prefer typing abstractions.

Additionally, consider using iterable unpacking (Ruff RUF005) for cleaner list construction.

Proposed fix
+from collections.abc import Sequence
+
+from langchain_core.messages import BaseMessage
+from nat.data_models.api_server import Message
+
 def _messages_to_langchain_messages(
-    nat_messages: list,
+    nat_messages: Sequence[Message],
     system_prompt: str,
-):
+) -> list[BaseMessage]:
     """Convert NAT Message list to LangChain BaseMessage list with system prompt prepended if needed."""
     from langchain_core.messages.utils import convert_to_messages

     message_dicts = [m.model_dump() for m in nat_messages]
     has_system = any(d.get("role") == "system" for d in message_dicts)
     if not has_system and system_prompt:
-        message_dicts = [{"role": "system", "content": system_prompt}] + message_dicts
+        message_dicts = [{"role": "system", "content": system_prompt}, *message_dicts]
     return convert_to_messages(message_dicts)

Note: The Message import may already be available or need to be added. The BaseMessage import can stay inside the function if preferred to avoid top-level langchain dependency.
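Putting the suggested annotations together, the helper can be exercised without langchain installed by stubbing the two external pieces (Message and convert_to_messages below are stand-ins for nat.data_models.api_server.Message and langchain_core's converter, respectively):

```python
from collections.abc import Sequence
from dataclasses import dataclass

@dataclass
class Message:  # stand-in for the NAT Message model
    role: str
    content: str

    def model_dump(self) -> dict:
        return {"role": self.role, "content": self.content}

def convert_to_messages(dicts):  # stand-in for langchain_core.messages.utils
    return list(dicts)

def _messages_to_langchain_messages(
    nat_messages: Sequence[Message],
    system_prompt: str,
) -> list:
    """Convert NAT messages, prepending the system prompt if absent."""
    message_dicts = [m.model_dump() for m in nat_messages]
    has_system = any(d.get("role") == "system" for d in message_dicts)
    if not has_system and system_prompt:
        # Iterable unpacking instead of list concatenation (RUF005).
        message_dicts = [{"role": "system", "content": system_prompt}, *message_dicts]
    return convert_to_messages(message_dicts)

out = _messages_to_langchain_messages([Message("user", "hi")], "Be brief.")
```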

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/src/nat/tool/chat_completion.py` around lines 49 -
60, Add proper type hints and tighten the nat_messages type: change the
parameter nat_messages: list to nat_messages: Iterable[Message] (import Message
from langchain_core.messages or the appropriate module) and add a return type of
list[BaseMessage] (import BaseMessage or keep it as a local import). Inside
_messages_to_langchain_messages, build the final message_dicts using iterable
unpacking instead of concatenation when prepending the system prompt (e.g.,
[{"role":"system","content": system_prompt}, *message_dicts]) and keep the
convert_to_messages call to produce the final List[BaseMessage]; ensure
necessary imports for Message and BaseMessage are added or referenced locally in
the function.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c2d9703 and e83ee19.

📒 Files selected for processing (1)
  • packages/nvidia_nat_core/src/nat/tool/chat_completion.py

Comment on lines +96 to +99
    if isinstance(response, str):
        output_text = response
    else:
        output_text = response.text() if hasattr(response, "text") else str(response.content)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, examine the actual code in the file to understand context and imports
head -120 packages/nvidia_nat_core/src/nat/tool/chat_completion.py | tail -60

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 2694


🏁 Script executed:

# Check imports to understand what types are being used
head -30 packages/nvidia_nat_core/src/nat/tool/chat_completion.py

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 1351


🏁 Script executed:

# Search for how LangChain message content is accessed in the codebase
rg -n 'BaseMessage|\.content|\.text\(' packages/nvidia_nat_core/src/nat/ --type=py | head -30

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 3934


🏁 Script executed:

# Check if there are any LangChain related files that show message handling patterns
find packages -name "*.py" -type f -exec rg -l "BaseMessage" {} \; | head -10

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 898


🌐 Web query:

LangChain BaseMessage content property API documentation

💡 Result:

  • Python (langchain_core.messages.base.BaseMessage): content is a required field (str | list[str | dict]) holding the message payload; BaseMessage is the abstract base for HumanMessage, AIMessage, etc. See the API doc for BaseMessage (params/attributes include content, additional_kwargs, response_metadata, type, etc.). [1]

  • JavaScript/TypeScript (@langchain/core messages): message classes (e.g., AIMessage) inherit from BaseMessage and expose a content property whose type supports either plain text or structured content blocks (LangChain v1 message content blocks). See AIMessage (shows content type) and the Messages docs (explains what content can contain). [2], [3]

Sources: [1] [2] [3]


Simplify message content extraction to use LangChain's content property directly.

LangChain's BaseMessage uses a content property (str | list), not a text() method. The hasattr(response, "text") check will always be False for LangChain messages. Simplify line 99 to: output_text = response.content if isinstance(response.content, str) else str(response.content) (or simply output_text = response.content if content is guaranteed to be a string).
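The suggested extraction can be demonstrated with a stub in place of a LangChain AIMessage (only the .content attribute is assumed, per the API doc quoted above; the stub and function name are illustrative):

```python
class StubMessage:
    """Minimal stand-in for a LangChain BaseMessage: just .content."""
    def __init__(self, content):
        self.content = content

def extract_text(response) -> str:
    """Return the text of a plain string or message-like response."""
    if isinstance(response, str):
        return response
    content = response.content
    # content may be a str or a list of content blocks; serialize the latter.
    return content if isinstance(content, str) else str(content)
```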

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/src/nat/tool/chat_completion.py` around lines 96 -
99, The current extraction uses hasattr(response, "text") and response.text(),
which is incorrect for LangChain BaseMessage; update the logic in the chat
completion handler where response and output_text are set (in
packages/nvidia_nat_core/src/nat/tool/chat_completion.py) to read from
response.content instead: replace the branch that checks for response.text()
with code that assigns output_text = response.content if it's already a string
or serializes it (e.g., str(response.content)) otherwise, ensuring you still
handle the case where response is a plain str by keeping the existing
isinstance(response, str) check.

Comment on lines 118 to +131
        except Exception as e:
            # Fallback response if LLM call fails
-            return (f"I apologize, but I encountered an error while processing your "
-                    f"query: '{query}'. Please try rephrasing your question or try "
-                    f"again later. Error: {str(e)}")
+            last_content = ""
+            try:
+                msg = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
+                if msg.messages:
+                    last = msg.messages[-1].content
+                    last_content = last if isinstance(last, str) else str(last)
+            except Exception:
+                pass
+            return (
+                f"I apologize, but I encountered an error while processing your "
+                f"query: '{last_content}'. Please try rephrasing your question or try "
+                f"again later. Error: {str(e)}"
+            )

    yield _chat_completion

⚠️ Potential issue | 🟠 Major

Add logging for caught exceptions and avoid exposing raw error details to users.

The exception handling has several issues:

  1. Missing logging: Per coding guidelines, when catching exceptions without re-raising, use logger.exception() to capture the full stack trace.
  2. Silent exception swallowing (lines 125-126): The nested try-except-pass silently discards errors, making debugging difficult (Ruff S110).
  3. Information leakage: Exposing str(e) in user-facing messages may reveal internal implementation details.
Proposed fix
+import logging
+
+logger = logging.getLogger(__name__)
+
         except Exception as e:
+            logger.exception("Error processing chat completion request")
             last_content = ""
             try:
                 msg = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
                 if msg.messages:
                     last = msg.messages[-1].content
                     last_content = last if isinstance(last, str) else str(last)
-            except Exception:
-                pass
+            except Exception:
+                logger.debug("Could not extract last message content for error reporting")
             return (
                 f"I apologize, but I encountered an error while processing your "
-                f"query: '{last_content}'. Please try rephrasing your question or try "
-                f"again later. Error: {str(e)}"
+                f"query: '{last_content}'. Please try rephrasing your question or try again later."
             )

Removing the raw exception from the user-facing message prevents potential information leakage while the logged exception preserves full debugging context.
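The pattern the review is asking for looks roughly like the following (safe_handler and the wrapped callable are illustrative; in the PR the real logic lives inside _chat_completion):

```python
import logging

logger = logging.getLogger(__name__)

def safe_handler(fn, query: str) -> str:
    try:
        return fn(query)
    except Exception:
        # Full stack trace goes to the logs for operators...
        logger.exception("Error processing chat completion request")
        # ...while the user sees only a generic, non-sensitive message.
        return (f"I apologize, but I encountered an error while processing "
                f"your query: '{query}'. Please try rephrasing your question "
                f"or try again later.")

def _boom(query: str) -> str:
    raise RuntimeError("internal: db password rotated")

reply = safe_handler(_boom, "What is NAT?")
```

The raw exception text never reaches the caller, while logger.exception preserves the traceback for debugging.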

🧰 Tools
🪛 Ruff (0.15.2)

[warning] 118-118: Do not catch blind exception: Exception

(BLE001)


[error] 125-126: try-except-pass detected, consider logging the exception

(S110)


[warning] 125-125: Do not catch blind exception: Exception

(BLE001)


[warning] 130-130: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/src/nat/tool/chat_completion.py` around lines 118 -
131, The except block currently swallows errors, fails to log stack traces, and
returns the raw exception text to the user; update the handler to log the full
exception via logger.exception(...) when the outer Exception is caught and also
remove the use of str(e) in the returned user-facing message, returning a
generic error string instead; replace the silent nested try-except around
GlobalTypeConverter.get().convert(...) with a logged except (e.g., log
conversion errors with logger.exception(...) referencing
GlobalTypeConverter.get().convert and the local variables msg/last_content) so
conversion failures aren’t silently dropped, and ensure the final return uses
only a non-sensitive generic message that may include last_content but not the
raw exception details.

Member

@willkill07 willkill07 left a comment


Approved. do we want this rebased+retargeted for release/1.5 ?
