feat(browser/langgraph): Add browser LangGraph tests with variant options#87

Open
priscilawebdev wants to merge 12 commits into main from priscilawebdev/feat/browser-langgraph-variants

Conversation

@priscilawebdev
Member

@priscilawebdev priscilawebdev commented Mar 5, 2026

Add browser LangGraph tests covering five instrumentation patterns, using the generic options system to keep everything in a single langgraph framework folder.

A variant option expands the test matrix. All variants run both streaming and blocking. Use --option variant=<name> to filter.
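
As a rough illustration of how a variant option could expand the matrix, here is a minimal sketch. The names `expandMatrix`, `parseVariantOption`, and `TestCase` are hypothetical and are not taken from this repository; only the five variant names, the two modes, and the `--option variant=<name>` syntax come from the description above.

```typescript
// Hypothetical types and helpers; the real test runner may be structured differently.
type StreamingMode = "streaming" | "blocking";

interface TestCase {
  name: string;
  mode: StreamingMode;
  variant: string;
}

const VARIANTS = ["graph", "langchain", "combined", "compiled", "custom-state"] as const;
const MODES: StreamingMode[] = ["streaming", "blocking"];

// Expand each test name into one case per (mode, variant) pair.
// When variantFilter is given, only that variant is kept.
function expandMatrix(testNames: string[], variantFilter?: string): TestCase[] {
  const variants = VARIANTS.filter(
    (v) => variantFilter === undefined || v === variantFilter,
  );
  const cases: TestCase[] = [];
  for (const name of testNames) {
    for (const mode of MODES) {
      for (const variant of variants) {
        cases.push({ name, mode, variant });
      }
    }
  }
  return cases;
}

// Read "--option variant=<name>" from an argv-style array.
function parseVariantOption(argv: string[]): string | undefined {
  for (let i = 0; i < argv.length - 1; i++) {
    const next = argv[i + 1];
    if (argv[i] === "--option" && next !== undefined && next.startsWith("variant=")) {
      return next.slice("variant=".length);
    }
  }
  return undefined;
}
```

Under these assumptions, one test definition yields ten cases without a filter and two cases (streaming and blocking) when a single variant is selected.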

Variants

| Variant | Sentry API | Known issue |
| --- | --- | --- |
| graph | instrumentLangGraph() only | Streaming: no invoke_agent span |
| langchain | createLangChainCallbackHandler() only | Callback handler doesn't set gen_ai.agent.name / gen_ai.operation.name |
| combined | Both APIs together | Chat spans dropped intermittently; duplicate invoke_agent spans |
| compiled | instrumentLangGraph() on compiled graph | Crashes with TypeError |
| custom-state | instrumentLangGraph() with custom state | recordInputs/recordOutputs silently records nothing |

Other changes

  • checkAgentSpanAttributes now validates gen_ai.agent.name matches the expected name from the test definition (not just existence), catching regressions like the unknown_chain bug
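
A minimal sketch of what a match-the-expected-name check might look like follows. The `Span` and `CheckResult` shapes and the function name `checkAgentName` are assumptions for illustration, not the repository's actual `checkAgentSpanAttributes` implementation; the mismatch message is also invented.

```typescript
// Hypothetical span shape; the real check in this repo may differ.
interface Span {
  op: string;
  data: Record<string, unknown>;
}

interface CheckResult {
  passed: boolean;
  errors: string[];
}

// Fails when no agent span exists, when gen_ai.agent.name is missing,
// or when it is present but differs from the expected name.
function checkAgentName(spans: Span[], expectedName: string): CheckResult {
  const errors: string[] = [];
  const agentSpans = spans.filter((s) => s.op === "gen_ai.invoke_agent");
  if (agentSpans.length === 0) {
    errors.push("Should have at least one agent span");
  }
  for (const span of agentSpans) {
    const actual = span.data["gen_ai.agent.name"];
    if (actual === undefined) {
      errors.push(
        "Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute",
      );
    } else if (actual !== expectedName) {
      // Hypothetical message wording for the mismatch case.
      errors.push(
        `gen_ai.agent.name should be "${expectedName}" but was "${String(actual)}"`,
      );
    }
  }
  return { passed: errors.length === 0, errors };
}
```

An existence-only check would accept a span named `unknown_chain`; comparing against the expected name is what turns that case into a failure.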

Closes https://linear.app/getsentry/issue/TET-1946/ai-testing-framework-dogfood-langgraph-in-a-browser-runtime

priscilawebdev and others added 2 commits March 5, 2026 11:22
Introduces five isolated browser test frameworks for LangGraph, each
targeting a specific known Sentry SDK instrumentation bug. Splitting
into separate framework folders means each bug is independently
observable and its fix can be validated in isolation.

## Frameworks added

### langgraph (instrumentLangGraph only)
Uses StateGraph(MessagesAnnotation) + Sentry.instrumentLangGraph() before
compile(). Blocking invoke() works and produces an invoke_agent span with
input/output messages. Streaming (stream()) is not patched — no spans
are created (Bug 3). streamingMode is set to "both" to surface this difference.

### langgraph-langchain (createLangChainCallbackHandler only)
Uses StateGraph + createLangChainCallbackHandler passed to
compiledGraph.invoke(). This triggers handleChainStart for each LangGraph
node, which previously produced spans named "unknown_chain" instead of
the actual node name (Bug 2, fixed in sentry-javascript#19554). No
invoke_agent span is created since instrumentLangGraph is not used.

### langgraph-combined (both APIs together)
Uses both instrumentLangGraph and createLangChainCallbackHandler together.
Combined use causes chat spans to be orphaned (not nested inside
invoke_agent) and missing input/output messages. Spurious invoke_agent
sub-spans also appear (Bug 4).

### langgraph-compiled (instrumentLangGraph on compiled graph)
Uses createReactAgent (which returns an already-compiled graph) and then
calls instrumentLangGraph on the result. This crashes with a TypeError
because instrumentLangGraph cannot patch a graph that is already compiled
(Bug 1). Mirrors the pattern shown in the official Sentry docs.

### langgraph-custom-state (Annotation.Root custom state)
Uses StateGraph with Annotation.Root instead of MessagesAnnotation.
instrumentLangGraph runs without error and invoke_agent spans are created,
but recordInputs/recordOutputs silently records nothing because the state
has no "messages" key (Bug 5).
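
To illustrate the failure mode only, the following sketch mimics an instrumentation that looks for a "messages" key on the graph state. `extractRecordedMessages` and `GraphState` are hypothetical names; this is not the Sentry SDK's code, just an assumption about how such a lookup can come up empty without raising an error.

```typescript
// Hypothetical helper: returns the messages to record, or undefined if the
// state has no "messages" array. Not the SDK's actual implementation.
type GraphState = Record<string, unknown>;

function extractRecordedMessages(state: GraphState): unknown[] | undefined {
  const messages = state["messages"];
  return Array.isArray(messages) ? messages : undefined;
}

// A MessagesAnnotation-style state carries a "messages" array.
const messagesState: GraphState = {
  messages: [{ role: "user", content: "hi" }],
};

// A custom state (illustrative field names) has no "messages" key,
// so nothing is found and nothing would be recorded.
const customState: GraphState = { question: "hi", answer: "" };

const recordedFromMessages = extractRecordedMessages(messagesState);
const recordedFromCustom = extractRecordedMessages(customState);
```

In this sketch the custom state produces `undefined` with no exception, which matches the "silently records nothing" behavior described above.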

## Supporting changes

- Add checkAgentInputOutputMessages check: validates that invoke_agent
  spans carry gen_ai.input.messages and gen_ai.output.messages when using
  instrumentLangGraph. Skips for non-instrumentLangGraph frameworks. Fails
  for langgraph-custom-state to surface Bug 5.
- Fix agents/node/langgraph/config.json: add missing sentryVersions field
  that was causing the framework to be silently skipped by discovery.
- Update CLAUDE.md and templates/README.md with browser variant docs and
  supported SDK table entries.

Co-Authored-By: Claude <noreply@anthropic.com>
The Sentry JavaScript SDK had a bug in createLangChainCallbackHandler
where handleChainStart only read 4 of the 8 parameters passed by
LangChain. The 8th argument (runName) carries the actual LangGraph node
name, but was never reached. As a result, every chain span fell back to
"unknown_chain" regardless of the actual node name (Bug 2).

This was fixed in sentry-javascript#19554 by reading runName as the
first fallback before chain.name.
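
The fallback order described above can be sketched as follows. This is a simplified illustration, not the SDK's source: `SerializedChain` and `resolveChainName` are hypothetical, and the names of the intermediate parameters are assumptions. Only the position of runName (8th argument) and the "unknown_chain" default come from the description.

```typescript
// Hypothetical, simplified shapes; not the SDK's actual source.
interface SerializedChain {
  name?: string;
}

// Prefer runName, then chain.name, then the "unknown_chain" default.
function resolveChainName(chain: SerializedChain, runName?: string): string {
  return runName || chain.name || "unknown_chain";
}

// A handler that declares all eight positional parameters so that
// runName is actually reachable. Unused parameters are prefixed with "_".
function handleChainStart(
  chain: SerializedChain,
  _inputs: unknown,
  _runId: string,
  _parentRunId?: string,
  _tags?: string[],
  _metadata?: Record<string, unknown>,
  _runType?: string,
  runName?: string, // 8th argument: carries the LangGraph node name
): string {
  return resolveChainName(chain, runName);
}
```

A handler that declares only the first four parameters never sees runName, so every call falls through to the default.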

## Changes

### Fix langgraph-langchain template
The callback handler was previously passed to llm.invoke() rather than
compiledGraph.invoke(). This meant handleChainStart was never triggered
for LangGraph nodes, so Bug 2 was not observable at all. Moving the
callback to compiledGraph.invoke() causes handleChainStart to fire for
each node (e.g. "agent", "__start__"), which is the correct usage and
the pattern that surfaced the original bug.

### Add checkLangChainNodeNames check
New check that only runs for langgraph-langchain (skipIf for all other
frameworks). Finds invoke_agent spans with a langchain.chain.name
attribute (created by handleChainStart) and asserts none of them carry
the value "unknown_chain". This directly validates the fix in #19554
and will catch any regression.

Note: handleChainStart creates spans with op="gen_ai.invoke_agent" but
without gen_ai.operation.name in data, so the check uses a direct op
filter rather than findAgentSpans() which relies on that attribute.

The check is added to all six agent test cases so that any regression
is caught regardless of which test runs.
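
A minimal sketch of such a check, assuming a simple `Span` shape with `op` and `data` fields (hypothetical, as is the error message wording). It filters directly on op, as the note above describes, instead of relying on gen_ai.operation.name.

```typescript
// Hypothetical span shape; the real check may differ.
interface Span {
  op: string;
  data: Record<string, unknown>;
}

// Returns one error per invoke_agent span whose langchain.chain.name
// attribute still carries the "unknown_chain" fallback.
function checkLangChainNodeNames(spans: Span[]): string[] {
  return spans
    // Direct op filter: these spans lack gen_ai.operation.name in data.
    .filter((s) => s.op === "gen_ai.invoke_agent")
    // Only spans created by handleChainStart carry langchain.chain.name.
    .filter((s) => "langchain.chain.name" in s.data)
    .filter((s) => s.data["langchain.chain.name"] === "unknown_chain")
    .map(() => 'Chain span should not be named "unknown_chain"');
}
```

Spans without a langchain.chain.name attribute are ignored, so the check stays neutral for spans produced by other instrumentation paths.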

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions
github-actions bot commented Mar 5, 2026

🔴 AI SDK Integration Test Results

Status: 2 regressions detected

Summary

| Metric | main | PR | Change |
| --- | --- | --- | --- |
| Total Tests | 448 | 514 | +66 |
| Passed | 230 | 229 | -1 ⚠️ |
| Failed | 218 | 272 | +54 ⚠️ |

🔴 Regressions

These tests were passing on main but are now failing:

cloudflare/anthropic :: Basic LLM Test (blocking)

Error: Test execution failed: Wrangler exited with code 1

Test execution failed: Wrangler exited with code 1
stdout: 
 ⛅️ wrangler 4.71.0
───────────────────
Using secrets defined in .dev.vars
Your Worker has access to the following bindings:
Binding                                                                   Resource                  Mode
env.SENTRY_DSN ("http://public@localhost:43417/122874768")                Environment Variable      local
env.RUN_ID ("run-1773140003246-uwgcpwx")                                  Environment Variable      local
env.OPENAI_API_KEY ("(hidden)")                                           Environment Variable      local
env.ANTHROPIC_API_KEY ("(hidden)")                                        Environment Variable      local
env.GOOGLE_GENAI_API_KEY ("(hidden)")                                     Environment Variable      local

*** Fatal uncaught kj::Exception: kj/async-io-unix.c++:945: failed: ::bind(sockfd, &addr.generic, addrlen): Address already in use; toString() = 127.0.0.1:9229
stack: /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f0a176 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f09f49 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f0862c /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f44fcc /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f459c8 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f46381 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f47b3d /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1ead534 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f3a9ff /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f3ae50 
/home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f38cd5 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f38aeb /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1e96cb5 /lib/x86_64-linux-gnu/libc.so.6@2a1c9 /lib/x86_64-linux-gnu/libc.so.6@2a28a /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/anthropic-0.39.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1e96024


stderr: ✘ [ERROR] Address already in use (127.0.0.1:9229). Please check that you are not already running a server on this address or specify a different port with --port.


🪵  Logs were written to "/home/runner/.config/.wrangler/logs/wrangler-2026-03-10_11-05-26_467.log"

cloudflare/google-genai :: Vision LLM Test (blocking)

Error: Test execution failed: Wrangler exited with code 1

Test execution failed: Wrangler exited with code 1
stdout: 
 ⛅️ wrangler 4.71.0
───────────────────
Using secrets defined in .dev.vars
Your Worker has access to the following bindings:
Binding                                                                    Resource                  Mode
env.SENTRY_DSN ("http://public@localhost:43417/2064454...")                Environment Variable      local
env.RUN_ID ("run-1773140003246-f9wved0")                                   Environment Variable      local
env.OPENAI_API_KEY ("(hidden)")                                            Environment Variable      local
env.ANTHROPIC_API_KEY ("(hidden)")                                         Environment Variable      local
env.GOOGLE_GENAI_API_KEY ("(hidden)")                                      Environment Variable      local

*** Fatal uncaught kj::Exception: kj/async-io-unix.c++:945: failed: ::bind(sockfd, &addr.generic, addrlen): Address already in use; toString() = 127.0.0.1:9229
stack: /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f0a176 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f09f49 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f0862c /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f44fcc /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f459c8 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f46381 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1f47b3d /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1ead534 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f3a9ff /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f3ae50 
/home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f38cd5 /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@4f38aeb /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1e96cb5 /lib/x86_64-linux-gnu/libc.so.6@2a1c9 /lib/x86_64-linux-gnu/libc.so.6@2a28a /home/runner/work/testing-ai-sdk-integrations/testing-ai-sdk-integrations/runs/cloudflare/google-genai-1.38.0-sentry-latest/node_modules/@cloudflare/workerd-linux-64/bin/workerd@1e96024


stderr: ✘ [ERROR] Address already in use (127.0.0.1:9229). Please check that you are not already running a server on this address or specify a different port with --port.


🪵  Logs were written to "/home/runner/.config/.wrangler/logs/wrangler-2026-03-10_11-05-49_509.log"

✅ Fixed

These tests were failing on main but are now passing:

  • cloudflare/google-genai :: Basic LLM Test (streaming)

🆕 New Tests

Failing (66):

browser/langgraph :: Basic Agent Test (streaming, graph)

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one chat/completion span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Basic Agent Test (streaming, langchain)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat or agent span
browser/langgraph :: Basic Agent Test (streaming, combined)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Basic Agent Test (streaming, compiled)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one chat or agent span
browser/langgraph :: Basic Agent Test (streaming, custom-state)

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one chat/completion span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Basic Agent Test (blocking, graph)

Error: 1 check(s) failed:

1 check(s) failed:
Should have at least one chat/completion span
browser/langgraph :: Basic Agent Test (blocking, langchain)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat or agent span
browser/langgraph :: Basic Agent Test (blocking, combined)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
browser/langgraph :: Basic Agent Test (blocking, compiled)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one chat or agent span
browser/langgraph :: Basic Agent Test (blocking, custom-state)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
gen_ai.input.messages should not be empty
browser/langgraph :: Tool Call Agent Test (streaming, graph)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Tool Call Agent Test (streaming, langchain)

Error: 9 check(s) failed:

9 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
Should have at least one chat or agent span
browser/langgraph :: Tool Call Agent Test (streaming, combined)

Error: 8 check(s) failed:

8 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Tool Call Agent Test (streaming, compiled)

Error: 8 check(s) failed:

8 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one AI span
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
Should have at least one chat or agent span
browser/langgraph :: Tool Call Agent Test (streaming, custom-state)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Tool Call Agent Test (blocking, graph)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
browser/langgraph :: Tool Call Agent Test (blocking, langchain)

Error: 9 check(s) failed:

9 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
Should have at least one chat or agent span
browser/langgraph :: Tool Call Agent Test (blocking, combined)

Error: 7 check(s) failed:

7 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
browser/langgraph :: Tool Call Agent Test (blocking, compiled)

Error: 8 check(s) failed:

8 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one AI span
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
Should have at least one chat or agent span
browser/langgraph :: Tool Call Agent Test (blocking, custom-state)

Error: 7 check(s) failed:

7 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat span
Should have at least one chat span
Should have at least 2 tool span(s) but found 0
gen_ai.input.messages should not be empty
browser/langgraph :: Tool Error Agent Test (streaming, graph)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one chat span
Should have at least one chat span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (streaming, langchain)

Error: 8 check(s) failed:

8 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Should have at least one chat span
Should have at least one chat span
Should have at least one chat or agent span
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (streaming, combined)

Error: 7 check(s) failed:

7 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Should have at least one chat span
Should have at least one chat span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (streaming, compiled)

Error: 8 check(s) failed:

8 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one AI span
Should have at least one chat span
Should have at least one chat span
Should have at least one chat or agent span
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (streaming, custom-state)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one chat span
Should have at least one chat span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (blocking, graph)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one chat span
Should have at least one chat span
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (blocking, langchain)

Error: 8 check(s) failed:

8 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Should have at least one chat span
Should have at least one chat span
Should have at least one chat or agent span
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (blocking, combined)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Should have at least one chat span
Should have at least one chat span
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (blocking, compiled)

Error: 8 check(s) failed:

8 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one AI span
Should have at least one chat span
Should have at least one chat span
Should have at least one chat or agent span
Should have at least one tool span but found none
browser/langgraph :: Tool Error Agent Test (blocking, custom-state)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one chat/completion span
Should have at least one tool span
Should have at least one chat span
Should have at least one chat span
gen_ai.input.messages should not be empty
Should have at least one tool span but found none
browser/langgraph :: Vision Agent Test (streaming, graph)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
Should have at least one span with messages attribute
browser/langgraph :: Vision Agent Test (streaming, langchain)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat or agent span
Should have at least one chat or agent span
browser/langgraph :: Vision Agent Test (streaming, combined)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
Should have at least one span with messages attribute
browser/langgraph :: Vision Agent Test (streaming, compiled)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one chat or agent span
Should have at least one chat or agent span
browser/langgraph :: Vision Agent Test (streaming, custom-state)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
Should have at least one span with messages attribute
browser/langgraph :: Vision Agent Test (blocking, graph)

Error: 1 check(s) failed:

1 check(s) failed:
Should have at least one chat/completion span
browser/langgraph :: Vision Agent Test (blocking, langchain)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat or agent span
Should have at least one chat or agent span
browser/langgraph :: Vision Agent Test (blocking, combined)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
browser/langgraph :: Vision Agent Test (blocking, compiled)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one chat or agent span
Should have at least one chat or agent span
browser/langgraph :: Vision Agent Test (blocking, custom-state)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one chat/completion span
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
gen_ai.input.messages should not be empty
Messages should contain '[Blob substitute]' marker indicating binary content was redacted
browser/langgraph :: Long Input Agent Test (streaming, graph)

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one chat/completion span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Long Input Agent Test (streaming, langchain)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Should have at least one chat or agent span
browser/langgraph :: Long Input Agent Test (streaming, combined)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Long Input Agent Test (streaming, compiled)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one chat or agent span
browser/langgraph :: Long Input Agent Test (streaming, custom-state)

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one chat/completion span
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Long Input Agent Test (blocking, graph)

Error: 1 check(s) failed:

1 check(s) failed:
Should have at least one chat/completion span
browser/langgraph :: Long Input Agent Test (blocking, langchain)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Should have at least one chat or agent span
browser/langgraph :: Long Input Agent Test (blocking, combined)

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
browser/langgraph :: Long Input Agent Test (blocking, compiled)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one chat or agent span
browser/langgraph :: Long Input Agent Test (blocking, custom-state)

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one chat/completion span
gen_ai.input.messages should not be empty
browser/langgraph :: Conversation ID Agent Test (streaming, graph)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Expected at least 4 chat spans for conversation ID validation, got 0
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Conversation ID Agent Test (streaming, langchain)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Expected at least 4 chat spans for conversation ID validation, got 0
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat or agent span
browser/langgraph :: Conversation ID Agent Test (streaming, combined)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Expected at least 4 chat spans for conversation ID validation, got 0
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Conversation ID Agent Test (streaming, compiled)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one AI span
Should have at least one chat or agent span
browser/langgraph :: Conversation ID Agent Test (streaming, custom-state)

Error: 3 check(s) failed:

3 check(s) failed:
Should have at least one chat/completion span
Expected at least 4 chat spans for conversation ID validation, got 0
Should have at least one span with gen_ai.input.messages or gen_ai.request.messages
browser/langgraph :: Conversation ID Agent Test (blocking, graph)

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one chat/completion span
Expected at least 4 chat spans for conversation ID validation, got 0
browser/langgraph :: Conversation ID Agent Test (blocking, langchain)

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Expected at least 4 chat spans for conversation ID validation, got 0
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
Should have at least one chat or agent span
browser/langgraph :: Conversation ID Agent Test (blocking, combined)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one chat/completion span
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Agent span (gen_ai.invoke_agent) should have gen_ai.agent.name attribute
Expected at least 4 chat spans for conversation ID validation, got 0
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
browser/langgraph :: Conversation ID Agent Test (blocking, compiled)

Error: 5 check(s) failed:

5 check(s) failed:
Should have at least one agent span
Should have at least one chat/completion span
Should have at least one AI span
Should have at least one AI span
Should have at least one chat or agent span
browser/langgraph :: Conversation ID Agent Test (blocking, custom-state)

Error: 4 check(s) failed:

4 check(s) failed:
Should have at least one chat/completion span
Expected at least 4 chat spans for conversation ID validation, got 0
Token usage validation failed:
  input_tokens must exist
  output_tokens must exist
  total_tokens must exist
gen_ai.input.messages should not be empty
gen_ai.input.messages should not be empty
gen_ai.input.messages should not be empty
gen_ai.input.messages should not be empty
node/langgraph :: Basic Agent Test

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one agent span
Should have at least one agent span
gen_ai.response.model is missing (optional but recommended)
node/langgraph :: Tool Call Agent Test

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one agent span
Should have at least one tool span
Should have at least one agent span
Should have a chat span with gen_ai.tool.definitions or gen_ai.request.available_tools
Should have at least 2 tool call(s) in response but found 0
Response should include tool call for "add" (mapped from "add")
Response should include tool call for "multiply" (mapped from "multiply")
Should have at least 2 tool span(s) but found 0
gen_ai.response.model is missing (optional but recommended)
node/langgraph :: Tool Error Agent Test

Error: 6 check(s) failed:

6 check(s) failed:
Should have at least one agent span
Should have at least one tool span
Should have at least one agent span
Should have a chat span with gen_ai.tool.definitions or gen_ai.request.available_tools
Should have at least 1 tool call(s) in response but found 0
Response should include tool call for "read_file" (mapped from "read_file")
Should have at least one tool span but found none
gen_ai.response.model is missing (optional but recommended)
node/langgraph :: Vision Agent Test

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one agent span
Should have at least one agent span
gen_ai.response.model is missing (optional but recommended)
node/langgraph :: Long Input Agent Test

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one agent span
Should have at least one agent span
gen_ai.response.model is missing (optional but recommended)
node/langgraph :: Conversation ID Agent Test

Error: 2 check(s) failed:

2 check(s) failed:
Should have at least one agent span
Should have at least one agent span
gen_ai.response.model is missing (optional but recommended)
gen_ai.response.model is missing (optional but recommended)
gen_ai.response.model is missing (optional but recommended)
gen_ai.response.model is missing (optional but recommended)

Test Matrix

Agent Tests

| SDK | Basic Agent Test | Conversation ID Agent Test | Long Input Agent Test | Tool Call Agent Test | Tool Error Agent Test | Vision Agent Test |
| --- | --- | --- | --- | --- | --- | --- |
| browser/langgraph | ❌🆕 blk, combined; ❌🆕 blk, compiled; ❌🆕 blk, custom-state; ❌🆕 blk, graph; ❌🆕 blk, langchain; ❌🆕 str, combined; ❌🆕 str, compiled; ❌🆕 str, custom-state; ❌🆕 str, graph; ❌🆕 str, langchain | ❌🆕 blk, combined; ❌🆕 blk, compiled; ❌🆕 blk, custom-state; ❌🆕 blk, graph; ❌🆕 blk, langchain; ❌🆕 str, combined; ❌🆕 str, compiled; ❌🆕 str, custom-state; ❌🆕 str, graph; ❌🆕 str, langchain | ❌🆕 blk, combined; ❌🆕 blk, compiled; ❌🆕 blk, custom-state; ❌🆕 blk, graph; ❌🆕 blk, langchain; ❌🆕 str, combined; ❌🆕 str, compiled; ❌🆕 str, custom-state; ❌🆕 str, graph; ❌🆕 str, langchain | ❌🆕 blk, combined; ❌🆕 blk, compiled; ❌🆕 blk, custom-state; ❌🆕 blk, graph; ❌🆕 blk, langchain; ❌🆕 str, combined; ❌🆕 str, compiled; ❌🆕 str, custom-state; ❌🆕 str, graph; ❌🆕 str, langchain | ❌🆕 blk, combined; ❌🆕 blk, compiled; ❌🆕 blk, custom-state; ❌🆕 blk, graph; ❌🆕 blk, langchain; ❌🆕 str, combined; ❌🆕 str, compiled; ❌🆕 str, custom-state; ❌🆕 str, graph; ❌🆕 str, langchain | ❌🆕 blk, combined; ❌🆕 blk, compiled; ❌🆕 blk, custom-state; ❌🆕 blk, graph; ❌🆕 blk, langchain; ❌🆕 str, combined; ❌🆕 str, compiled; ❌🆕 str, custom-state; ❌🆕 str, graph; ❌🆕 str, langchain |
| cloudflare/langgraph | | | | | | |
| cloudflare/vercel | | | | | | |
| nextjs/mastra | | | | | | |
| nextjs/vercel | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| node/langgraph | ❌🆕 | ❌🆕 | ❌🆕 | ❌🆕 | ❌🆕 | ❌🆕 |
| node/manual | | | | | | |
| node/mastra | | | | | | |
| node/vercel | | | | | | |
| php/laravel | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| python/langgraph | a; s | a; s | a; s | a; s | a; s | a; s |
| python/manual | a; s | a; s | a; s | a; s | a; s | a; s |
| python/openai-agents | | | | | | |
| python/pydantic-ai | | | | | | |

Embedding Tests

| SDK | Basic Embeddings Test |
| --- | --- |
| browser/google-genai | |
| browser/langchain | |
| browser/openai | |
| cloudflare/google-genai | |
| cloudflare/langchain | |
| cloudflare/openai | |
| cloudflare/vercel | |
| nextjs/google-genai | |
| nextjs/langchain | |
| nextjs/openai | |
| nextjs/vercel | |
| node/google-genai | |
| node/langchain | |
| node/openai | |
| node/vercel | |
| php/laravel | |
| python/google-genai | a, blk; s, blk |
| python/langchain | a, blk; s, blk |
| python/litellm | a, blk; s, blk |
| python/manual | a, blk; s, blk |
| python/openai | a, blk; s, blk |

LLM Tests

| SDK | Basic Error LLM Test | Basic LLM Test | Conversation ID LLM Test | Long Input LLM Test | Multi-Turn LLM Test | Vision LLM Test |
| --- | --- | --- | --- | --- | --- | --- |
| browser/anthropic | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| browser/google-genai | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| browser/langchain | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| browser/openai | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| cloudflare/anthropic | blk; str | ❌📉 blk; str | blk; str | blk; str | blk; str | blk; str |
| cloudflare/google-genai | blk; str | blk; ✅🔧 str | blk; str | blk; str | blk; str | ❌📉 blk; str |
| cloudflare/langchain | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| cloudflare/openai | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| nextjs/anthropic | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| nextjs/google-genai | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| nextjs/langchain | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| nextjs/openai | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| node/anthropic | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| node/google-genai | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| node/langchain | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| node/manual | | | | | | |
| node/openai | blk; str | blk; str | blk; str | blk; str | blk; str | blk; str |
| python/anthropic | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str |
| python/google-genai | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str |
| python/langchain | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str |
| python/litellm | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str |
| python/manual | a, blk; s, blk | a, blk; s, blk | a, blk; s, blk | a, blk; s, blk | a, blk; s, blk |
| python/openai | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str | a, blk; a, str; s, blk; s, str |

Legend: ✅ Pass | ❌ Fail | ✅🔧 Fixed | ❌📉 Regressed | ✅🆕 New (pass) | ❌🆕 New (fail) | 🗑️ Removed | str=streaming blk=blocking a=async s=sync hi=highlevel lo=lowlevel


Generated by AI SDK Integration Tests

priscilawebdev and others added 4 commits March 5, 2026 11:57
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
@priscilawebdev priscilawebdev marked this pull request as ready for review March 5, 2026 11:09
linear-code bot commented Mar 5, 2026

@priscilawebdev priscilawebdev requested a review from a team March 5, 2026 11:10
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Custom-state template extracts system message, not user message
    • The custom-state template now selects the first message with role "user" (with fallback) before deriving userInput, so the intended user prompt is sent to the model.
  • ✅ Fixed: Combined template callback placement prevents full bug reproduction
    • The combined template now invokes the compiled graph with the callback handler and removes callbacks from llm.invoke, enabling graph-level callback events needed for the interference reproduction.

Create PR

Or push these changes by commenting:

@cursor push 6f72370ba1
Preview (6f72370ba1)
diff --git a/src/runner/templates/agents/browser/langgraph-combined/template.njk b/src/runner/templates/agents/browser/langgraph-combined/template.njk
--- a/src/runner/templates/agents/browser/langgraph-combined/template.njk
+++ b/src/runner/templates/agents/browser/langgraph-combined/template.njk
@@ -48,9 +48,9 @@
             apiKey: OPENAI_API_KEY,
           });
 
-          // Build graph with agent node — passes callback handler to llm.invoke()
+          // Build graph with agent node
           async function agentNode(state) {
-            const response = await llm.invoke(state.messages, { callbacks: [callbackHandler] });
+            const response = await llm.invoke(state.messages);
             return { messages: [response] };
           }
 
@@ -88,7 +88,10 @@
 
           try {
             log('Starting request {{ loop.index }}...');
-            const result = await compiledGraph.invoke({ messages: messages{{ loop.index }} });
+            const result = await compiledGraph.invoke(
+              { messages: messages{{ loop.index }} },
+              { callbacks: [callbackHandler] }
+            );
             const lastMessage = result.messages[result.messages.length - 1];
             log('Response {{ loop.index }}:', lastMessage.content);
           } catch (error) {

diff --git a/src/runner/templates/agents/browser/langgraph-custom-state/template.njk b/src/runner/templates/agents/browser/langgraph-custom-state/template.njk
--- a/src/runner/templates/agents/browser/langgraph-custom-state/template.njk
+++ b/src/runner/templates/agents/browser/langgraph-custom-state/template.njk
@@ -71,7 +71,11 @@
 {% endif %}
           // Request {{ loop.index }}{% if loop.length > 1 %} of {{ loop.length }}{% endif %}
           // Extract the first user message text as a plain string for the custom state
-          const userInput{{ loop.index }} = {% if input.messages[0].content is string %}"{{ input.messages[0].content }}"{% else %}"{{ input.messages[0].content[0].text }}"{% endif %};
+          const inputMessages{{ loop.index }} = {{ input.messages | dump }};
+          const userMessage{{ loop.index }} = inputMessages{{ loop.index }}.find((message) => message.role === "user") ?? inputMessages{{ loop.index }}[0];
+          const userInput{{ loop.index }} = typeof userMessage{{ loop.index }}.content === "string"
+            ? userMessage{{ loop.index }}.content
+            : (userMessage{{ loop.index }}.content.find((part) => part.type === "text")?.text ?? "");
 
           try {
             log('Starting request {{ loop.index }}...');

Move callbackHandler from llm.invoke() to graph.invoke() in
langgraph-combined template — LangGraph auto-propagates callbacks to
nested calls, so graph-level is the realistic user pattern.

Fix misleading comment in langgraph-custom-state that said "first user
message" when it actually extracts the first message regardless of role.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
priscilawebdev and others added 4 commits March 9, 2026 09:32
…plicate invoke_agent spans

Update CLAUDE.md and template comments to reflect observed behavior:
chat spans are dropped intermittently (not orphaned) and duplicate
invoke_agent spans are produced when both instrumentation APIs are active.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tputMessages

The filter used exact === "invoke_agent" but findAgentSpans uses a regex
that also matches "gen_ai.invoke_agent". If the SDK emits the prefixed
form, the filter would silently drop all spans and skipIf would skip
the entire check without any visible failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
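
The mismatch described in the commit above can be illustrated with a minimal sketch. The span shape (an `op` field), the sample spans, and the regex are assumptions made for illustration; they are not the repository's actual `findAgentSpans` implementation.

```javascript
// Hypothetical spans; the `op` field name is an assumption for this sketch.
const spans = [
  { op: "gen_ai.invoke_agent", description: "agent run" },
  { op: "gen_ai.chat", description: "chat call" },
];

// Exact comparison: misses the prefixed form, so the result is empty
// and any check guarded by "skip if no spans" is silently skipped.
const exactMatches = spans.filter((span) => span.op === "invoke_agent");

// Regex comparison: accepts both "invoke_agent" and "gen_ai.invoke_agent".
const AGENT_OP = /^(gen_ai\.)?invoke_agent$/;
const regexMatches = spans.filter((span) => AGENT_OP.test(span.op));

console.log(exactMatches.length); // 0
console.log(regexMatches.length); // 1
```

Sharing one matcher between the span finder and the filter keeps the two from drifting apart.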
…h generic options

Replace five separate langgraph browser folders (langgraph, langgraph-langchain,
langgraph-combined, langgraph-compiled, langgraph-custom-state) with a single
langgraph folder using the generic options system. The `variant` option expands
the test matrix to cover all five instrumentation approaches.
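
As a rough illustration of how an option such as `variant` multiplies the matrix, the sketch below builds the cartesian product of option values and then filters it the way `--option variant=<name>` is described as doing. The `expandMatrix` helper and the `mode` key are hypothetical names, not the runner's real API.

```javascript
// Option values taken from the PR description; the helper is hypothetical.
const options = {
  mode: ["streaming", "blocking"],
  variant: ["graph", "langchain", "combined", "compiled", "custom-state"],
};

// Build every combination of option values (cartesian product).
function expandMatrix(opts) {
  return Object.entries(opts).reduce(
    (combos, [key, values]) =>
      combos.flatMap((combo) => values.map((value) => ({ ...combo, [key]: value }))),
    [{}],
  );
}

const matrix = expandMatrix(options);
console.log(matrix.length); // 10 combinations per test

// Equivalent of filtering with `--option variant=compiled`.
const compiledOnly = matrix.filter((combo) => combo.variant === "compiled");
console.log(compiledOnly.length); // 2 (streaming and blocking)
```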

All variants now run both streaming and blocking modes. Remove
integration-specific checks (checkAgentInputOutputMessages,
checkLangChainNodeNames) and instead validate gen_ai.agent.name matches the
expected name from the test definition in the generic checkAgentSpanAttributes.

Co-Authored-By: Claude <noreply@anthropic.com>
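
A check that compares `gen_ai.agent.name` against the expected name, rather than only testing for its presence, might look like the sketch below. The span shape (`data` holding attributes), the function name, the agent name `weather_agent`, and the failure messages are assumptions; the real `checkAgentSpanAttributes` may differ.

```javascript
// Hypothetical check: returns one failure message per offending agent span.
function checkAgentName(agentSpans, expectedName) {
  const failures = [];
  for (const span of agentSpans) {
    const actual = span.data?.["gen_ai.agent.name"];
    if (actual === undefined) {
      failures.push("Agent span should have gen_ai.agent.name attribute");
    } else if (actual !== expectedName) {
      failures.push(`gen_ai.agent.name should be "${expectedName}" but was "${actual}"`);
    }
  }
  return failures;
}

const agentSpans = [
  { data: { "gen_ai.agent.name": "unknown_chain" } }, // present but wrong
  { data: {} },                                       // missing entirely
  { data: { "gen_ai.agent.name": "weather_agent" } }, // correct
];

const result = checkAgentName(agentSpans, "weather_agent");
console.log(result.length); // 2
```

An existence-only check would pass the first span, which is the kind of regression the stricter comparison is meant to catch.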
@priscilawebdev priscilawebdev changed the title from "feat(browser/langgraph): add five browser LangGraph variant frameworks and validate instrumentation" to "feat(browser/langgraph): Add browser LangGraph tests with variant options" Mar 10, 2026
Co-Authored-By: Claude <noreply@anthropic.com>
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

{% if variant == "compiled" %}
const llm = new langchainOpenAI.ChatOpenAI({
modelName: {% if causeAPIError %}"invalid-model"{% else %}"{{ inputs[0].model }}"{% endif %},
openAIApiKey: OPENAI_API_KEY,

Compiled variant uses deprecated openAIApiKey parameter name

Medium Severity

The compiled variant passes openAIApiKey to the ChatOpenAI constructor, while all other variants in the same template (lines 85 and 110) correctly use apiKey. The @langchain/openai package deprecated openAIApiKey in favor of apiKey. Depending on the version of @langchain/openai (1.2.12 per config.json), this could cause the API key to be silently ignored, leading to authentication failures that get misattributed to the known TypeError issue for this variant rather than the actual root cause.

// Request {{ loop.index }}{% if loop.length > 1 %} of {{ loop.length }}{% endif %}

{% if variant == "custom-state" %}
const userInput{{ loop.index }} = {% if input.messages[0].content is string %}"{{ input.messages[0].content }}"{% else %}"{{ input.messages[0].content[0].text }}"{% endif %};

Custom-state variant sends system message as user input

Low Severity

The custom-state variant extracts input.messages[0].content as user input. For test cases like basicAgentTest and visionAgentTest, messages[0] is the system message (e.g., "You are a helpful assistant."), not the user message. The actual user question at messages[1] is silently discarded, and for the vision test the image content is completely lost. The template inconsistently picks the correct message depending on which test case it runs with.
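
One way to make the selection role-aware is sketched below in plain JavaScript. The helper name `extractUserText` and the sample messages are hypothetical, and the message shape (`role`, string or multi-part `content` with `type: "text"` parts) is assumed from the template code quoted in this thread.

```javascript
// Hypothetical helper: return the text of the first user message,
// falling back to the first message when no user message exists.
function extractUserText(messages) {
  const message = messages.find((m) => m.role === "user") ?? messages[0];
  if (!message) return "";
  if (typeof message.content === "string") return message.content;
  // Multi-part content: take the first text part, if any.
  const textPart = message.content.find((part) => part.type === "text");
  return textPart?.text ?? "";
}

const visionMessages = [
  { role: "system", content: "You are a helpful assistant." },
  {
    role: "user",
    content: [
      { type: "text", text: "What is in this image?" },
      { type: "image_url", image_url: { url: "data:image/png;base64,..." } },
    ],
  },
];

console.log(extractUserText(visionMessages)); // What is in this image?
```

This only addresses which message is selected. Because the custom state holds a plain string, image parts are still dropped.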
