
chatguru AI Agent


chatguru Agent is a production-ready whitelabel chatbot with RAG capabilities and agentic commerce integration, built with FastAPI, LangChain, and Azure OpenAI.


Brought with ❤️ by Netguru

Documentation

Read the full docs at: https://github.com/netguru/chatguru

Preview

chatguru Agent ships with WebSocket streaming, RAG capabilities, and comprehensive observability!

Key Features:

  • Real-time WebSocket streaming for instant responses
  • RAG-powered product search and recommendations
  • Comprehensive API documentation with Swagger UI

Installation

ℹ️ Requires Python 3.12+

Clone the repository and install the project's dependencies:

# Clone the repository
git clone <repository-url>
cd chatguru

# Complete development setup
make setup

After installation:

# Configure environment variables
make env-setup
# Edit .env with your credentials

# Start the development server
make dev

In Use

Once the server is running, open the minimal test UI at http://localhost:8000/.

Here's how to use the WebSocket API from your own code:

import asyncio
import websockets
import json

async def chat():
    uri = "ws://localhost:8000/ws"
    async with websockets.connect(uri) as websocket:
        # Send message
        await websocket.send(json.dumps({
            "message": "Hello, how are you?",
            "session_id": None
        }))

        # Receive streaming response
        async for message in websocket:
            data = json.loads(message)
            if data["type"] == "token":
                print(data["content"], end="", flush=True)
            elif data["type"] == "end":
                print("\n")
                break
            elif data["type"] == "error":
                print(f"Error: {data['content']}")
                break

asyncio.run(chat())

✨ Features

  • 🚀 WebSocket Streaming: Real-time streaming chat responses via WebSocket
  • 🧪 Minimal Test UI: Lightweight HTML at / for smoke testing only
  • 🎨 Whitelabel Design: Easily customizable for different brands and tenants
  • 🧠 RAG Capabilities: Semantic product search with sqlite-vec vector database
  • 🛒 Agentic Commerce: Ready for MCP (Model Context Protocol) integration
  • 📊 Observability: Built-in Langfuse tracing and monitoring
  • ✅ Testing: Comprehensive test suite with promptfoo LLM evaluation
  • 🐳 Production Ready: Docker containerization with health checks

🏗️ Architecture

Simple, modular architecture designed for whitelabel deployment:

graph LR
    subgraph "Current Implementation"
        UI[React/Vite Frontend<br/>frontend/] -->|WebSocket| API[FastAPI API]
        API -->|Streaming| AGENT[Agent Service]
        AGENT -->|AzureChatOpenAI| LLM[Azure OpenAI]
        AGENT -->|RAG Tool| PRODUCTDB[Product DB<br/>sqlite-vec]
        AGENT --> LANGFUSE[Langfuse<br/>Tracing]
    end

    subgraph "Future Extensions"
        MCP[MCP Tools<br/>Commerce Platforms]
        AGENT -.-> MCP
    end

For detailed architecture documentation, see docs/architecture.md.

🛠️ Technology Stack

  • Backend: FastAPI + Uvicorn (async)
  • AI/ML: LangChain + Azure OpenAI (direct integration)
  • LLM Provider: Azure OpenAI (via langchain-openai)
  • Vector Search: sqlite-vec (semantic product search)
  • Observability: Langfuse
  • Testing: pytest + promptfoo + GenericFakeChatModel
  • Code Quality: mypy + ruff + pre-commit
  • Frontend: React 19 + Vite (frontend/)
  • CSS: Tailwind CSS v4 (via @tailwindcss/vite)
  • Containerization: Docker + Docker Compose
  • Package Management: uv (Python) + npm (Node.js)
  • Development: Makefile for task automation

🌐 Frontend

A React + Vite frontend lives in the frontend/ directory.

Run it locally:

make frontend-dev   # Vite dev server → http://localhost:5173

Or via Docker Compose — the frontend service starts automatically on port 5173.

Copy the env template before running:

cp frontend/.env.example frontend/.env

📋 Prerequisites

Before you begin, ensure you have the following installed:

  • Python 3.12+ (Download)
  • Node.js 20+ and npm — required by React 19 (Download)
  • uv - Fast Python package installer (Installation guide)
  • Docker and Docker Compose (optional, for containerized deployment)
  • Azure OpenAI account with API access
  • Langfuse account (for observability and tracing)

🚀 Quick Start

Option 1: Local Development (Recommended for Development)

1. Clone the Repository

git clone <repository-url>
cd chatguru

2. Complete Development Setup

# Install dependencies and set up pre-commit hooks
make setup

This command will:

  • Install Python dependencies using uv
  • Install and configure pre-commit hooks
  • Set up the development environment

3. Configure Environment Variables

# Copy environment template
make env-setup

# Edit .env with your credentials
# Required: LLM_* and LANGFUSE_* variables (see Configuration section below)

4. Start the Development Server

make dev

5. Access the Application

The API is now available at http://localhost:8000 (test UI at /, Swagger UI at /docs, WebSocket at ws://localhost:8000/ws).

Option 2: Docker Deployment (Recommended for Production)

1. Clone and Configure

git clone <repository-url>
cd chatguru

# Copy and configure environment variables
make env-setup
# Edit .env with your credentials

2. Build and Run

# Build and start all services
make docker-run

# Or run in background
make docker-run-detached

3. Access the Application

The services are now available at http://localhost:8000 (backend API) and http://localhost:5173 (frontend).

🔧 Configuration

The application uses environment variables for configuration. Copy env.example to .env and configure the following:

Required Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| OPENAI_ENDPOINT | OpenAI-compatible base URL for chat + embeddings | https://your-resource.openai.azure.com/openai/v1 |
| LLM_API_KEY | Azure OpenAI API key | your-api-key-here |
| LLM_DEPLOYMENT_NAME | Azure OpenAI deployment name | gpt-4o-mini |
| LANGFUSE_PUBLIC_KEY | Langfuse public key | pk-lf-... |
| LANGFUSE_SECRET_KEY | Langfuse secret key | sk-lf-... |
| LANGFUSE_HOST | Langfuse host URL | https://cloud.langfuse.com |
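Put together, the required block of a .env file might look like the following sketch (every value is a placeholder taken from the table above, not a working credential):

```shell
OPENAI_ENDPOINT=https://your-resource.openai.azure.com/openai/v1
LLM_API_KEY=your-api-key-here
LLM_DEPLOYMENT_NAME=gpt-4o-mini
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com
```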

Optional Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| FASTAPI_HOST | API host address | 0.0.0.0 |
| FASTAPI_PORT | API port | 8000 |
| FASTAPI_CORS_ORIGINS | CORS allowed origins (JSON array) | ["*"] |
| APP_NAME | Application name | chatguru Agent |
| DEBUG | Enable debug mode | false |
| LOG_LEVEL | Logging level | INFO |
| VECTOR_DB_TYPE | Database type | sqlite |
| VECTOR_DB_SQLITE_URL | SQLite service URL | http://product-db:8001 |
| PERSISTENCE_DATABASE_URL | Async SQLAlchemy URL for chat history storage | (unset: disabled) |
| LLM_API_VERSION | API version for native Azure OpenAI setups | (empty) |
| LLM_OPENAI_BASE_URL | OpenAI v1-compatible chat base URL; when set, chat uses ChatOpenAI instead of native Azure routing | (empty) |
| TITLE_GENERATION_PROVIDER | Title provider: openai, fallback, custom | openai |
| TITLE_GENERATION_CUSTOM_CLASS | Custom class path (module.path:ClassName) when provider is custom | (empty) |

Chat history persistence

PERSISTENCE_DATABASE_URL is the single toggle for server-side chat history:

  • Unset (default) — persistence is disabled. The server is stateless: no database is required and no messages are stored. The /history and /conversations endpoints are not registered at all; they won't appear in /docs, and requests to them return 404.
  • Set — persistence is enabled. Messages and conversations are stored per visitor_id / session_id. Run make migrate once after setting the URL to create the schema.
# SQLite (local dev / single-node)
PERSISTENCE_DATABASE_URL=sqlite+aiosqlite:///data/chatguru.db

# PostgreSQL
PERSISTENCE_DATABASE_URL=postgresql+asyncpg://user:pass@localhost:5432/chatguru

See docs/persistence.md for the full architecture and instructions on adding new database adapters.

The two LLM URL modes are documented in docs/design-decisions.md: setting LLM_OPENAI_BASE_URL uses a universal OpenAI-compatible API, while leaving it empty and setting OPENAI_ENDPOINT uses the native Azure OpenAI client.
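As a sketch, the two modes differ only in which variables are set (all values below are placeholders):

```shell
# Mode 1: universal OpenAI-compatible API (chat goes through ChatOpenAI)
LLM_OPENAI_BASE_URL=https://api.openai.com/v1

# Mode 2: native Azure OpenAI client (leave LLM_OPENAI_BASE_URL empty)
OPENAI_ENDPOINT=https://your-resource.openai.azure.com/openai/v1
LLM_API_VERSION=2024-06-01
```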

See env.example for a complete template with detailed comments.

📡 API Documentation

WebSocket API

The primary interface for chat is via WebSocket at ws://localhost:8000/ws.

Request Format

{
  "message": "Your message here",
  "session_id": "optional-session-id",
  "messages": [
    {"role": "user", "content": "previous user message"},
    {"role": "assistant", "content": "previous assistant response"}
  ]
}

Response Format

Responses are streamed as JSON messages:

// Token chunk (streamed multiple times)
{"type": "token", "content": "chunk of text", "session_id": "session-id"}

// End of stream (includes the full response as a fallback)
{"type": "end", "content": "full assistant response", "session_id": "session-id"}

// Error response
{"type": "error", "content": "error message", "session_id": "session-id"}
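The three message types suggest a simple client-side accumulator. The sketch below (function and variable names are illustrative, not part of the API) collects token chunks until an end or error message arrives, preferring the full response carried by the end message:

```python
import json

def consume_stream(raw_messages):
    """Accumulate token chunks until an end or error message arrives.

    raw_messages: an iterable of JSON strings as received over the WebSocket.
    Returns (full_text, error) where error is None on success.
    """
    chunks = []
    for raw in raw_messages:
        data = json.loads(raw)
        if data["type"] == "token":
            chunks.append(data["content"])
        elif data["type"] == "end":
            # The end message carries the full response as a fallback;
            # prefer it over the locally accumulated chunks when present.
            full = data["content"]
            return (full if full else "".join(chunks), None)
        elif data["type"] == "error":
            return ("".join(chunks), data["content"])
    return ("".join(chunks), None)

# Example with a synthetic stream:
stream = [
    json.dumps({"type": "token", "content": "Hel", "session_id": "s1"}),
    json.dumps({"type": "token", "content": "lo!", "session_id": "s1"}),
    json.dumps({"type": "end", "content": "Hello!", "session_id": "s1"}),
]
text, err = consume_stream(stream)
print(text)  # Hello!
```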

REST API

  • Health Check: GET /health
  • API Documentation: GET /docs (Swagger UI)
  • OpenAPI Schema: GET /openapi.json

The following endpoints are only registered when PERSISTENCE_DATABASE_URL is set:

  • GET /history — returns stored messages for a visitor_id + session_id pair, oldest first.
    • Query params: visitor_id (required), session_id (default: "default")
  • GET /conversations — returns all conversations for a visitor_id, newest first.
    • Query params: visitor_id (required)
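A small hedged sketch of calling these endpoints from Python, assuming a server on the default host and port (the helper functions below are illustrative; only the endpoint paths and query parameters come from the API above):

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumption: default FASTAPI_HOST/FASTAPI_PORT

def history_url(visitor_id: str, session_id: str = "default") -> str:
    """Build the GET /history URL for a visitor/session pair."""
    query = urlencode({"visitor_id": visitor_id, "session_id": session_id})
    return f"{BASE_URL}/history?{query}"

def conversations_url(visitor_id: str) -> str:
    """Build the GET /conversations URL for a visitor."""
    return f"{BASE_URL}/conversations?{urlencode({'visitor_id': visitor_id})}"

print(history_url("v-123"))
# http://localhost:8000/history?visitor_id=v-123&session_id=default

# With the server running (and PERSISTENCE_DATABASE_URL set) you could then do:
#   import urllib.request, json
#   messages = json.loads(urllib.request.urlopen(history_url("v-123")).read())
```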

🛠️ Development

Available Commands

Run make help to see all available commands. Key commands:

Installation & Setup

make setup          # Complete development setup
make env-setup      # Copy environment template
make install        # Install production dependencies

Development Servers

make dev            # Start backend development server (auto-reload)
make frontend-dev   # Start frontend development server (Vite, port 5173)
make run            # Start production server (no auto-reload)

Testing

make test           # Run all tests
make coverage       # Run tests with coverage report
make promptfoo-eval # Run LLM evaluation tests
make promptfoo-view # View evaluation results

Code Quality

make pre-commit-install  # Install pre-commit hooks
make pre-commit          # Run pre-commit checks manually

Docker

make docker-build        # Build Docker images
make docker-run          # Run with Docker Compose (foreground)
make docker-run-detached # Run with Docker Compose (background)
make docker-stop         # Stop services
make docker-down         # Stop and remove containers
make docker-logs         # View logs
make docker-clean        # Clean all Docker resources

Utilities

make version        # Show current version
make clean          # Clean Python cache files

Project Structure

chatguru/
├── frontend/                # React + Vite frontend
│   ├── src/                 # Source code (components, hooks, pages)
│   ├── public/              # Static assets
│   ├── .env.example         # Frontend env template
│   └── package.json
├── src/                     # Main application code
│   ├── api/                 # FastAPI application
│   │   ├── main.py         # FastAPI app setup
│   │   ├── templates/      # Minimal HTML test UI
│   │   └── routes/         # API routes
│   │       └── chat.py     # WebSocket chat endpoint
│   ├── agent/              # Agent implementation
│   │   ├── service.py      # LangChain agent with streaming
│   │   ├── prompt.py       # System prompts
│   │   └── __init__.py
│   ├── product_db/          # Product database (sqlite-vec)
│   │   ├── api.py          # FastAPI service
│   │   ├── store.py        # ProductStore with embeddings
│   │   ├── sqlite.py       # HTTP client for agent
│   │   ├── base.py         # Abstract interface
│   │   └── factory.py      # Database factory
│   ├── rag/                # RAG components
│   │   ├── documents.py    # Document handling
│   │   ├── simple_retriever.py  # Retriever interface
│   │   └── products.json   # Sample products data
│   ├── config.py           # Configuration management
│   └── main.py             # Application entry point
├── tests/                  # Test suite
│   ├── test_api.py         # API endpoint tests
│   ├── test_agent.py       # Agent tests
│   └── conftest.py         # Test configuration
├── docs/                   # Documentation
│   └── architecture.md      # Architecture documentation
├── promptfoo/              # LLM evaluation config
│   ├── provider.py         # Python provider adapter
│   └── promptfooconfig.yaml
├── docker/                 # Docker configuration
│   ├── Dockerfile          # Backend Dockerfile
│   └── Dockerfile.db       # Product database Dockerfile
├── .pre-commit-config.yaml # Pre-commit hooks
├── docker-compose.yml      # Docker Compose setup
├── Makefile                # Development commands
├── pyproject.toml          # Python project configuration
├── env.example             # Environment template
└── README.md               # This file

🧪 Testing

Unit Tests

# Run all tests
make test

# Run with coverage report
make coverage

Tests use GenericFakeChatModel from LangChain for reliable, deterministic testing without API calls.
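To illustrate the idea behind this approach: the tests feed the agent scripted responses instead of calling a real LLM. The minimal plain-Python stand-in below mirrors the spirit of GenericFakeChatModel (it is not LangChain's actual class; names and method signatures here are illustrative):

```python
class FakeChatModel:
    """Deterministic stand-in for an LLM: replays scripted responses.

    Like LangChain's GenericFakeChatModel, it yields pre-canned
    messages so tests never hit a real API and always see the
    same output.
    """

    def __init__(self, responses):
        self._responses = iter(responses)

    def invoke(self, prompt: str) -> str:
        # Return the next scripted response regardless of the prompt.
        return next(self._responses)

    def stream(self, prompt: str):
        # Stream the next scripted response word by word,
        # mimicking token streaming.
        for word in next(self._responses).split():
            yield word

fake = FakeChatModel(["Hello there!", "I can help with products."])
assert fake.invoke("hi") == "Hello there!"
assert list(fake.stream("what can you do?")) == ["I", "can", "help", "with", "products."]
```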

LLM Evaluation with Promptfoo

# Run evaluation suite
make promptfoo-eval

# View results in browser
make promptfoo-view

# Run specific test file
make promptfoo-test TEST=tests/basic_greeting.yaml

Promptfoo tests evaluate response quality, helpfulness, and boundary conditions.

RAG Evaluation with RAGAS and RAG Evaluator

RAGAS (Retrieval-Augmented Generation Assessment) and RAG Evaluator are tools for evaluating Retrieval-Augmented Generation (RAG) pipelines. They provide metrics for faithfulness, answer relevance, context precision, and retrieval quality.

For detailed information on RAG testing and evaluation using RAGAS and RAG Evaluator, see docs/rag_eval_readme.md.

🐳 Docker Deployment

Quick Start

# Build and run backend with Docker Compose
make docker-run

Manual Docker Commands

# Build backend image
docker build -f docker/Dockerfile -t chatguru-agent .

# Run backend container
docker run -p 8000:8000 --env-file .env chatguru-agent

Ports

  • Frontend: 5173 (host) → 5173 (container)
  • Backend API: 8000 (host) → 8000 (container)
  • Product DB: 8001 (host) → 8001 (container)
  • WebSocket: ws://localhost:8000/ws
  • Test UI: http://localhost:8000/ (minimal, not production)

Frontend Service

The frontend service is included in Docker Compose and starts automatically on port 5173. WS_PROXY_TARGET controls where Vite proxies WebSocket traffic inside the Docker network (default: http://chatguru-agent:8000).

🐛 Troubleshooting

Common Issues

1. "Module not found" errors

Solution: Ensure dependencies are installed:

make install

2. WebSocket connection fails

Solution:

  • Verify backend is running: curl http://localhost:8000/health
  • Check WebSocket endpoint: ws://localhost:8000/ws
  • Ensure CORS is configured correctly in .env

3. Azure OpenAI authentication errors

Solution:

  • Verify OPENAI_ENDPOINT is a full OpenAI-compatible base URL ending in /v1
  • Check LLM_API_KEY is correct
  • Ensure LLM_DEPLOYMENT_NAME matches your Azure deployment
  • If using native Azure OpenAI routing, verify LLM_API_VERSION is supported

4. Langfuse connection errors

Solution:

  • Verify Langfuse credentials in .env
  • Check LANGFUSE_HOST is correct (default: https://cloud.langfuse.com)
  • Ensure network connectivity to Langfuse

5. Docker build fails

Solution:

  • Ensure uv.lock file exists (run uv sync locally first)
  • Check Docker has sufficient resources
  • Verify all required files are present

6. Port already in use

Solution:

  • Backend (8000): Stop other services using port 8000 or change FASTAPI_PORT
  • Frontend: Configure your external frontend to target the correct backend host/port

Getting Help

📚 Documentation

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for:

  • Development setup instructions
  • Code style guidelines
  • Testing requirements
  • Pull request process
  • Issue reporting guidelines

🔮 Roadmap

  • Vector Database Integration: sqlite-vec for semantic search ✅
  • Streaming Responses: Real-time chat streaming via WebSocket ✅
  • MCP Tools: Integration with commerce platforms (PimCore, Strapi, Medusa.js)
  • Authentication: JWT-based API authentication
  • Rate Limiting: API rate limiting and quotas
  • Session Management: Client-side persistent conversation history (localStorage) ✅
  • Server-side Sessions: Backend-persisted conversation history via PERSISTENCE_DATABASE_URL (opt-in) ✅
  • Multi-tenancy: Database-backed tenant configuration

📄 License

This library is available as open source under the terms of the MIT License.

🙏 Acknowledgments

🆘 Support

For support and questions:

