[Suggestion] LangGraph multi-step voice agent with Deepgram STT and TTS (Python) #248

@deepgram-robot

What to build

A Python example demonstrating a LangGraph stateful agent that uses Deepgram STT for voice input and Deepgram TTS for voice output, with multi-step tool calling and conditional branching between agent nodes.

Why this matters

LangGraph is a fast-growing agentic AI framework that lets developers build stateful, multi-step agent workflows with cycles and conditional logic. Developers building voice-enabled AI agents need a reference implementation showing how to wire Deepgram's streaming STT and TTS into LangGraph's graph-based execution model. This is distinct from the existing LangChain example: LangGraph handles stateful agent loops, not linear chains.

Suggested scope

  • Language: Python
  • Framework: LangGraph (langgraph >= 0.2)
  • Deepgram APIs: Streaming STT (Nova-3), TTS (Aura), Audio Intelligence (optional sentiment)
  • Pattern: Microphone → Deepgram STT → LangGraph agent graph (with tool nodes) → Deepgram TTS → speaker
  • Complexity: Medium — requires LangGraph graph definition, tool node implementation, and audio I/O
  • Backend only (CLI-based), no frontend required
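As a rough illustration of the pattern in the scope above (microphone → STT → agent → TTS → speaker), the stages can be sketched as a chain of streaming callables. This is only a shape sketch under stated assumptions: `run_pipeline`, `fake_transcribe`, `fake_agent`, and `fake_synthesize` are hypothetical names, and the stand-ins do not call Deepgram or LangGraph. A real example would replace them with the Deepgram streaming clients and a compiled LangGraph graph.

```python
from typing import Callable, Iterable, Iterator


def run_pipeline(
    mic_chunks: Iterable[bytes],
    transcribe: Callable[[Iterable[bytes]], Iterator[str]],
    agent: Callable[[str], str],
    synthesize: Callable[[str], Iterator[bytes]],
    play: Callable[[bytes], None],
) -> None:
    """Run each finalized transcript through the agent and play the reply.

    Every stage is an iterator, so audio is handled chunk by chunk
    instead of being collected into one batch.
    """
    for transcript in transcribe(mic_chunks):
        reply = agent(transcript)
        for audio_chunk in synthesize(reply):
            play(audio_chunk)


# --- Hypothetical stand-ins (assumptions, not Deepgram or LangGraph APIs) ---

def fake_transcribe(chunks: Iterable[bytes]) -> Iterator[str]:
    # Pretend each incoming chunk is one finalized utterance in UTF-8.
    for chunk in chunks:
        yield chunk.decode("utf-8")


def fake_agent(transcript: str) -> str:
    # A real example would invoke the LangGraph graph here.
    return f"You said: {transcript}"


def fake_synthesize(text: str, chunk_size: int = 8) -> Iterator[bytes]:
    # Pretend the text bytes are audio and emit them in small chunks.
    data = text.encode("utf-8")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Demo: collect the "played" chunks in a list instead of a speaker.
played: list = []
run_pipeline(
    [b"hello", b"what is the weather"],
    fake_transcribe,
    fake_agent,
    fake_synthesize,
    played.append,
)
```

Passing the stages in as callables keeps the audio I/O, the Deepgram clients, and the agent graph independently replaceable, which may also make the CLI example easier to test without a microphone.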

Acceptance criteria

  • Runnable with minimal setup (clone, add API key, run)
  • README explains the pattern clearly
  • Uses the current Deepgram Python SDK version
  • Demonstrates at least 2 tool nodes in the LangGraph graph (e.g., weather lookup + calendar check)
  • Shows conditional branching based on STT transcript content
  • Audio streams through Deepgram STT and TTS (not batch)
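The tool-node and branching criteria above could be approached along these lines. This is a minimal sketch, not the intended implementation: the state keys, node names, keyword rules, and placeholder tool replies are all assumptions made for illustration. It covers only the branching step, not multi-step tool calling. `build_graph` imports LangGraph lazily and relies on `StateGraph`, `START`, `END`, and `add_conditional_edges` from `langgraph.graph`; the rest is plain Python.

```python
from typing import TypedDict


class AgentState(TypedDict, total=False):
    transcript: str  # final STT transcript for the current turn
    reply: str       # text to hand to TTS


def route_transcript(state: AgentState) -> str:
    """Pick the next node from the transcript content (hypothetical rules)."""
    text = state["transcript"].lower()
    if "weather" in text:
        return "weather_tool"
    if "calendar" in text or "meeting" in text:
        return "calendar_tool"
    return "fallback"


def weather_tool(state: AgentState) -> dict:
    # Placeholder: a real node would call a weather API.
    return {"reply": "Placeholder forecast: sunny."}


def calendar_tool(state: AgentState) -> dict:
    # Placeholder: a real node would query a calendar service.
    return {"reply": "Placeholder calendar: no meetings found."}


def fallback(state: AgentState) -> dict:
    return {"reply": "Sorry, I can only help with weather or calendar."}


def build_graph():
    """Wire the nodes into a LangGraph graph (requires langgraph installed)."""
    from langgraph.graph import END, START, StateGraph

    builder = StateGraph(AgentState)
    builder.add_node("weather_tool", weather_tool)
    builder.add_node("calendar_tool", calendar_tool)
    builder.add_node("fallback", fallback)
    # The router's return value is looked up in this mapping.
    builder.add_conditional_edges(
        START,
        route_transcript,
        {
            "weather_tool": "weather_tool",
            "calendar_tool": "calendar_tool",
            "fallback": "fallback",
        },
    )
    for node in ("weather_tool", "calendar_tool", "fallback"):
        builder.add_edge(node, END)
    return builder.compile()
```

With LangGraph installed, `build_graph().invoke({"transcript": "..."})` should return a state whose `reply` field can be passed to TTS. Keyword matching is used here only to keep the sketch deterministic; an LLM-driven router would be the more likely choice in a full example.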

Raised by the DX intelligence system.
