This repository demonstrates how to use LangChain's with_structured_output method to get structured responses from language models.
It explores three approaches for defining and parsing the output schema:
- TypedDict
- Pydantic
- JSON Schema
The with_structured_output method in LangChain lets you specify the structure you expect an LLM's output to follow.
The model's responses then come back in a predictable, typed, machine-readable shape, which matters for any production application that parses LLM output in code.
Description:
TypedDict is Python's typed dictionary construct from the typing module. It declares the field names and types of a structured response, but those types are hints for static checkers and are not validated at runtime.
Advantages:
- Simple, lightweight, and built into Python.
- Easy to read and maintain.
- Minimal dependencies.
Use Cases:
- Quick prototypes.
- Internal tools where strict runtime validation is not essential.
- Small schemas with clear field definitions.
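As a minimal sketch of the TypedDict approach, the snippet below defines a schema and shows that Python itself does not validate it at runtime. The MovieReview schema, its fields, and the make_typed_dict_llm helper are hypothetical names chosen for illustration; the helper assumes an llm object that exposes with_structured_output, such as a LangChain chat model.

```python
from typing import Annotated, TypedDict, get_type_hints


class MovieReview(TypedDict):
    """Structured summary of a movie review (hypothetical example schema)."""

    # Annotated[type, default, description]: `...` marks the field as required.
    # This three-part form follows the pattern used in LangChain's documentation.
    summary: Annotated[str, ..., "One-sentence summary of the review"]
    rating: Annotated[int, ..., "Score from 1 to 5"]


def make_typed_dict_llm(llm):
    """Wrap a chat model so its replies follow MovieReview.

    Assumes `llm` exposes with_structured_output (not checked here).
    """
    return llm.with_structured_output(MovieReview)


# TypedDict is a static hint only: Python accepts a wrong value type at runtime.
unchecked: MovieReview = {"summary": "Great pacing", "rating": "five"}

# The declared field names can still be inspected programmatically.
field_names = sorted(get_type_hints(MovieReview))
```

With a TypedDict schema the structured model returns a plain dict, so any checks beyond the field names have to be written by hand.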
Description:
Pydantic models provide robust runtime type checking and validation.
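The model's output is parsed into a Pydantic object, so it must both match the schema and pass any field-level validation rules; output that fails either check raises an error instead of passing through silently.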
Advantages:
- Strong runtime validation.
- Automatic documentation and serialization.
- Detailed error messages.
- Rich support for complex nested objects.
Use Cases:
- Production APIs where input/output integrity is critical.
- Applications requiring advanced validation (e.g., date formats, enums).
- Integration with FastAPI or other frameworks.
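The validation described above might look like the following sketch. It assumes the pydantic package is installed; the Review model, its constraints, and the helper names are hypothetical, and make_pydantic_llm assumes an llm object that exposes with_structured_output.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Review(BaseModel):
    """Hypothetical review schema with validation rules."""

    summary: str = Field(description="One-sentence summary of the review")
    # ge/le constrain the numeric range; values outside 1..5 are rejected.
    rating: int = Field(ge=1, le=5, description="Score from 1 to 5")
    # Literal behaves like an enum: only these three strings are accepted.
    sentiment: Literal["positive", "negative", "neutral"]


def make_pydantic_llm(llm):
    """Wrap a chat model so its replies are parsed into Review instances.

    Assumes `llm` exposes with_structured_output (not checked here).
    """
    return llm.with_structured_output(Review)


def is_valid(payload: dict) -> bool:
    """Return True if the payload satisfies the Review schema, else False."""
    try:
        Review(**payload)
        return True
    except ValidationError:
        return False


valid = is_valid({"summary": "Great pacing", "rating": 4, "sentiment": "positive"})
out_of_range = is_valid({"summary": "Great pacing", "rating": 9, "sentiment": "positive"})
bad_enum = is_valid({"summary": "Great pacing", "rating": 4, "sentiment": "mixed"})
```

Unlike the TypedDict approach, a rating of 9 or an unknown sentiment value is rejected here rather than silently accepted.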
Description:
JSON Schema defines the structure, data types, and validation rules using a standardized, language-agnostic format.
Advantages:
- Language-independent: works across different programming environments.
- Flexible for dynamic schema generation.
- Widely adopted and standardized.
Use Cases:
- Cross-platform applications.
- Scenarios where schema definition must be shared between different languages or services.
- LLM applications consumed by non-Python environments.
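A JSON Schema version of a comparable schema can be sketched as a plain Python dict. The review_schema contents and the make_json_schema_llm helper are hypothetical; the top-level title and description keys are included on the assumption that LangChain uses them to name and describe the schema, and the helper assumes an llm object that exposes with_structured_output.

```python
import json

# Hypothetical schema, written as standard JSON Schema.
review_schema = {
    "title": "Review",
    "description": "Structured summary of a review.",
    "type": "object",
    "properties": {
        "summary": {
            "type": "string",
            "description": "One-sentence summary of the review",
        },
        "rating": {"type": "integer", "minimum": 1, "maximum": 5},
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative", "neutral"],
        },
    },
    "required": ["summary", "rating", "sentiment"],
}


def make_json_schema_llm(llm):
    """Wrap a chat model so its replies follow review_schema.

    Assumes `llm` exposes with_structured_output (not checked here).
    """
    return llm.with_structured_output(review_schema)


# The schema is plain data, so it can be serialized and shared with
# services written in other languages, then loaded back unchanged.
schema_text = json.dumps(review_schema, indent=2)
restored = json.loads(schema_text)
```

Because the schema is ordinary JSON, the same schema_text can be stored in a file or served over HTTP for non-Python consumers. As with TypedDict, the structured model returns a plain dict in this case.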