JSON schema output for self-referencing pydantic models #2181

@bertrandkerres

Description

Environment details

  • Programming language: python
  • OS: ubuntu-22 (but most likely others as well)
  • Language runtime version: 3.12.7
  • Package version: google-genai==1.60.0, pydantic>=2.0

Note:
I encountered this while using langchain-google-genai, where the following call path triggers the issue:

llm = ChatGoogleGenerativeAI(...)
pydantic_obj = llm.with_structured_output(SomePydanticClass).invoke(msg)

However, the bug appears to be in the google-genai package itself.

Steps to reproduce

I had Claude write an MWE to recreate the issue. Run the following code with GOOGLE_API_KEY set:

"""
Minimal Working Example: mutual $ref recursion causes RecursionError in google-genai.

Tested with: google-genai==1.60.0, pydantic==2.x

The bug
-------
`google.genai._transformers.process_schema` inlines $ref references recursively
without a cycle guard. A schema with *mutual* recursion — where type A references
type B which references type A — causes infinite recursion and a Python stack overflow.

This is the pattern produced by pydantic's model_json_schema() for models like:

    class Expr(RootModel[Union[Node, Leaf]]): ...   # Expr refs Node
    class Node(BaseModel):
        args: List[Expr]                             # Node refs Expr → Node → Expr → ...

Workaround
----------
Add a `__get_pydantic_json_schema__` classmethod to break the cycle in the
*serialized* schema before `process_schema` ever sees it. Pydantic runtime
validation is unaffected since it uses its own core schema, not the JSON schema.

Reproduction
------------
    pip install google-genai pydantic
    python mwe_google_genai_recursion.py
"""

import copy
import os
from typing import List, Literal, Union
from pydantic import BaseModel, RootModel
import google.genai as genai
import google.genai._transformers as transformers
from google.genai.types import GenerateContentConfig

client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
MODEL = "gemini-3-flash-preview"
PROMPT = "Generate a minimal example: a multiplication of 2 * 3, where each operand is a leaf."


# ---------------------------------------------------------------------------
# Mutually recursive schema: Expr → Node → Expr → ...
# ---------------------------------------------------------------------------

class Leaf(BaseModel):
    type: Literal["leaf"]
    value: float


class Node(BaseModel):
    """Operator node — mutual recursion with Expr, no workaround."""
    type: Literal["add", "mul"]
    args: List["Expr"]


class Expr(RootModel[Union[Node, Leaf]]):
    pass


Node.model_rebuild()
Expr.model_rebuild()


class NodeFixed(BaseModel):
    """Operator node — mutual recursion with Expr, workaround applied."""
    type: Literal["add", "mul"]
    args: List["ExprFixed"]

    @classmethod
    def __get_pydantic_json_schema__(cls, core_schema, handler):
        schema = handler(core_schema)
        # Replace the recursive $ref in args.items with {} (any) to break the cycle.
        schema.get("properties", {}).get("args", {})["items"] = {}
        return schema


class ExprFixed(RootModel[Union[NodeFixed, Leaf]]):
    pass


NodeFixed.model_rebuild()
ExprFixed.model_rebuild()


# ---------------------------------------------------------------------------
# 1. Reproduce the bug: process_schema overflows on mutual $ref cycle
# ---------------------------------------------------------------------------

print("=== process_schema on raw schema (bug) ===")
try:
    schema_buggy = Expr.model_json_schema()
    # process_schema mutates in-place; copy to avoid tainting the schema object.
    transformers.process_schema(copy.deepcopy(schema_buggy), client=None)
    print("No error — bug may be fixed in this version.")
except RecursionError:
    print("RecursionError: maximum recursion depth exceeded  ✓ (bug reproduced)")

# ---------------------------------------------------------------------------
# 2. Workaround: __get_pydantic_json_schema__ breaks the cycle at schema
#    generation time; process_schema and the API call then succeed.
# ---------------------------------------------------------------------------

print()
print("=== API call with response_json_schema — fixed schema (workaround) ===")
schema_fixed = ExprFixed.model_json_schema()
try:
    transformers.process_schema(schema_fixed, client=None)  # verify no overflow before sending
    response = client.models.generate_content(
        model=MODEL,
        contents=PROMPT,
        config=GenerateContentConfig(
            response_mime_type="application/json",
            response_json_schema=schema_fixed,
        ),
    )
    print(f"SUCCESS  ✓  Response: {response.text[:200]}")
except RecursionError:
    print("RecursionError: workaround did not help.")
except Exception as e:
    print(f"API error: {e}")

# ---------------------------------------------------------------------------
# 3. response_schema=pydantic_class also calls process_schema internally,
#    so it also overflows for mutually recursive types without the fix.
#    Using the fixed class (with __get_pydantic_json_schema__ override) works.
# ---------------------------------------------------------------------------

print()
print("=== API call with response_schema=Expr (also overflows) ===")
try:
    response = client.models.generate_content(
        model=MODEL,
        contents=PROMPT,
        config=GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=Expr,
        ),
    )
    print(f"No error. Response: {response.text[:200]}")
except RecursionError:
    print("RecursionError: maximum recursion depth exceeded  ✓ (same bug, different path)")
except Exception as e:
    print(f"API error: {e}")

print()
print("=== API call with response_schema=ExprFixed (workaround applies here too) ===")
try:
    response = client.models.generate_content(
        model=MODEL,
        contents=PROMPT,
        config=GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=ExprFixed,
        ),
    )
    print(f"SUCCESS  ✓  Response: {response.text[:200]}")
except RecursionError:
    print("RecursionError: unexpected.")
except Exception as e:
    print(f"API error: {e}")
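
For illustration, here is a minimal sketch of what a cycle guard around $ref inlining could look like. It is not google-genai's actual implementation: `inline_refs` and its `active` parameter are hypothetical names, and the schema is hand-written to have the same mutual Expr/Node reference shape as the MWE (the exact output of `model_json_schema()` may differ). A re-entered definition is replaced with `{}` (any), mirroring the workaround above.

```python
import json
from typing import Any


def inline_refs(node: Any, defs: dict[str, Any], active: frozenset = frozenset()) -> Any:
    """Return a copy of `node` with local "#/$defs/..." references inlined.

    `active` holds the definition names being expanded on the current path.
    Meeting one of them again means a cycle, so {} is emitted instead of
    recursing forever. Keys that sit next to a "$ref" are dropped here for
    brevity; a real implementation would need to merge them.
    """
    if isinstance(node, list):
        return [inline_refs(item, defs, active) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str) and ref.startswith("#/$defs/"):
        name = ref.removeprefix("#/$defs/")
        if name in active:
            return {}  # cycle detected: fall back to "any"
        return inline_refs(defs[name], defs, active | {name})
    return {k: inline_refs(v, defs, active) for k, v in node.items() if k != "$defs"}


# Hand-written schema with mutual recursion: Expr -> Node -> Expr -> ...
schema = {
    "$defs": {
        "Expr": {"anyOf": [{"$ref": "#/$defs/Node"}, {"$ref": "#/$defs/Leaf"}]},
        "Leaf": {
            "type": "object",
            "properties": {"type": {"const": "leaf"}, "value": {"type": "number"}},
        },
        "Node": {
            "type": "object",
            "properties": {
                "type": {"enum": ["add", "mul"]},
                "args": {"type": "array", "items": {"$ref": "#/$defs/Expr"}},
            },
        },
    },
    "$ref": "#/$defs/Expr",
}

flat = inline_refs(schema, schema["$defs"])
print(json.dumps(flat, indent=2))
```

Tracking only the names on the current expansion path (rather than every name seen so far) lets a definition such as Leaf be inlined in several places while still stopping at the first true cycle. The trade-off is that nesting is only described one level deep in the flattened schema.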

Metadata

Labels

priority: p2, type: bug
