CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Text2SQL is a natural-language-to-SQL system with visualization and voice-assistant features. It translates natural-language questions into SQL, executes the resulting queries against SQLite databases, and provides interactive visualizations and voice-based interactions. LLM capabilities come from either Azure OpenAI or Anthropic Claude; Qdrant provides vector storage and Groq powers the voice features.

Architecture

Backend Architecture (Flask + Python)

The backend is a Flask application (app.py) serving both API endpoints and the React frontend build. The core system follows a layered architecture:

Layer 1: Talk2SQL Engine (talk2sql/)

  • base.py: Abstract base class defining the core Talk2SQL interface (SQL generation, retry logic, visualization)
  • engine/Talk2SQL_azure.py: Azure OpenAI implementation with query history tracking in SQLite
  • engine/Talk2SQL_anthropic.py: Anthropic Claude implementation with similar capabilities
  • Both implementations use multiple inheritance to combine LLM capabilities with vector storage

Layer 2: LLM Integration (talk2sql/llm/)

  • azure_openai.py: Azure OpenAI client wrapper for completions and embeddings
  • anthropic.py: Anthropic Claude client wrapper for completions

Layer 3: Vector Store (talk2sql/vector_store/)

  • qdrant.py: Qdrant vector database integration for storing question-SQL pairs, schema, and documentation
  • Uses three collections per database: questions, schema, and docs
  • Embeddings are generated via Azure OpenAI (text-embedding-ada-002)

Key Components:

  • RAG System: Uses vector similarity search to find relevant question-SQL examples, schema info, and documentation
  • Retry Mechanism: Automatically retries failed SQL queries by generating corrected versions using error context (up to 3 attempts by default)
  • Query History: Stores all queries, errors, retries, timing, and results in SQLite (databases/query_history.sqlite)
  • Memory Tracking: Tracks whether RAG context (similar questions/schema) was used in SQL generation
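The RAG retrieval step boils down to ranking stored question-SQL pairs by embedding similarity. A minimal pure-Python sketch of that idea (the real system delegates this to Qdrant and 1536-dimensional Azure OpenAI embeddings; the function names and toy 2-D vectors here are illustrative):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_examples(query_vec, stored, k=3):
    """Return the k stored (question, sql) pairs most similar to the query.

    `stored` is a list of (embedding, question, sql) tuples."""
    ranked = sorted(stored, key=lambda item: cosine_similarity(query_vec, item[0]),
                    reverse=True)
    return [(q, sql) for _, q, sql in ranked[:k]]

# Toy 2-D "embeddings" stand in for the 1536-dim Azure OpenAI vectors.
stored = [
    ([1.0, 0.0], "How many teams are there?", "SELECT COUNT(*) FROM teams;"),
    ([0.0, 1.0], "List all players", "SELECT name FROM players;"),
]
best = top_k_examples([0.9, 0.1], stored, k=1)
```

The top-k pairs are then injected into the LLM prompt as few-shot context alongside the retrieved schema and docs.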

Frontend Architecture (React + TypeScript + Vite)

Built with React 18, TypeScript, and Tailwind CSS, using Vite for build tooling.

Main Components:

  • App.tsx: Main application with tabbed interface (Query, Database, History, Metrics, Voice)
  • HistoryPage.tsx: Query history browser with export functionality
  • MetricsPage.tsx: Analytics dashboard with success rates, latency metrics, and error analysis
  • PlotlyChart.tsx: Plotly.js visualization wrapper
  • AudioPlayer.tsx: Voice playback controls

Key Features:

  • Real-time streaming responses via Server-Sent Events
  • Interactive Plotly visualizations auto-generated from query results
  • Voice assistant with transcription and text-to-speech
  • Query history with full result preservation and re-execution

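The streaming responses follow the standard Server-Sent Events wire format. A small sketch of how a backend frames each chunk (the helper name and event type are illustrative, not taken from the codebase):

```python
import json

def sse_event(data, event=None):
    """Frame a payload as a Server-Sent Events message.

    Each event is a `data:` line (JSON-encoded here) terminated by a blank
    line; an optional `event:` field lets the client dispatch by type."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"

# A Flask view would yield these frames from a generator with
# mimetype="text/event-stream".
frame = sse_event({"sql": "SELECT 1;"}, event="sql_generated")
```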
Common Development Commands

Backend Setup and Running

```bash
# Install Python dependencies
pip install -r requirements.txt

# Run the Flask development server
python app.py

# The app runs on port 8080 by default
# Access at: http://localhost:8080
```

Frontend Development

```bash
cd frontend

# Install dependencies
npm install

# Development mode (for frontend-only changes)
npm run dev

# Build for production (required before running Flask app)
npm run build

# The build outputs to frontend/dist/, which Flask serves automatically
```

Environment Configuration

Create a .env file in the project root with:

```bash
# Azure OpenAI (required for embeddings and main LLM)
AZURE_OPENAI_API_KEY=your_key
AZURE_ENDPOINT=your_endpoint
AZURE_API_VERSION=2024-02-15-preview
AZURE_DEPLOYMENT=gpt-4o-mini

# Anthropic (optional alternative to Azure OpenAI for completions)
ANTHROPIC_API_KEY=your_key

# Qdrant Vector Database (optional for persistent storage)
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_api_key

# Groq API (required for voice features)
GROQ_API_KEY=your_key
```
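A startup helper can fail fast when required keys are missing. A minimal sketch using the variable names above (the helper name and defaults are illustrative, not from the codebase):

```python
import os

def load_settings():
    """Read the environment variables documented above, failing fast when
    required keys are missing. (Illustrative helper, not from the codebase.)"""
    required = ["AZURE_OPENAI_API_KEY", "AZURE_ENDPOINT"]
    missing = [k for k in required if not os.getenv(k)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    return {
        "azure_api_key": os.environ["AZURE_OPENAI_API_KEY"],
        "azure_endpoint": os.environ["AZURE_ENDPOINT"],
        "azure_api_version": os.getenv("AZURE_API_VERSION", "2024-02-15-preview"),
        "azure_deployment": os.getenv("AZURE_DEPLOYMENT", "gpt-4o-mini"),
        "anthropic_api_key": os.getenv("ANTHROPIC_API_KEY"),  # optional
        "groq_api_key": os.getenv("GROQ_API_KEY"),            # voice features
    }
```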

Docker Deployment

```bash
# Build and run with Docker
docker build -t text2sql .
docker run -p 8080:8080 text2sql

# Or deploy to fly.io
fly deploy
```

Code Architecture Details

SQL Generation Flow

  1. User submits natural language question via /ask endpoint
  2. System generates embeddings for the question (Azure OpenAI)
  3. Vector search retrieves similar questions, relevant schema, and docs from Qdrant
  4. LLM (Azure OpenAI or Anthropic) generates SQL using RAG context
  5. SQL is extracted from LLM response (handles <sql> tags, markdown, etc.)
  6. System attempts to execute SQL against connected SQLite database
  7. If execution fails, error is passed back to LLM to generate corrected SQL
  8. Retry loop continues up to max_retry_attempts (default: 3)
  9. On success: results are returned, visualization code is generated, summary is created
  10. All data (query, SQL, results, timing, errors) is stored in query history database
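Steps 6-8 can be sketched as a generate-execute-retry loop. The function below is a simplified stand-in for the real flow, with `generate_sql` and `execute` as placeholders for the LLM call and the SQLite execution step (names and return shape are illustrative):

```python
def run_with_retries(generate_sql, execute, question, max_retry_attempts=3):
    """Generate SQL, execute it, and feed errors back for correction.

    `generate_sql(question, error)` and `execute(sql)` stand in for the LLM
    call and the SQLite execution step described above."""
    error = None
    for attempt in range(max_retry_attempts):
        sql = generate_sql(question, error)
        try:
            return {"sql": sql, "rows": execute(sql), "attempts": attempt + 1}
        except Exception as exc:  # pass the error back as correction context
            error = str(exc)
    raise RuntimeError(f"Query failed after {max_retry_attempts} attempts: {error}")

# Toy harness: the first attempt fails, the corrected retry succeeds.
executed = []
def fake_generate(question, error):
    return "SELECT 1;" if error else "SELEC 1;"
def fake_execute(sql):
    executed.append(sql)
    if sql.startswith("SELEC "):
        raise ValueError('near "SELEC": syntax error')
    return [(1,)]

result = run_with_retries(fake_generate, fake_execute, "test question")
```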

Vector Store Organization

Per-database collections in Qdrant:

  • {db_name}_questions: Question-SQL pairs from training data and user feedback
  • {db_name}_schema: Database schema (DDL) extracted via PRAGMA table_info
  • {db_name}_docs: Additional documentation strings
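The naming convention above can be captured in a tiny helper, shown here as an illustrative sketch (the function is not from the codebase):

```python
def collection_names(db_name):
    """Qdrant collection names for a database, per the convention above."""
    return {
        "questions": f"{db_name}_questions",
        "schema": f"{db_name}_schema",
        "docs": f"{db_name}_docs",
    }
```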

Visualization Generation

  1. After successful query, system checks if visualization is appropriate (should_generate_visualization)
  2. LLM generates Python/Plotly code based on the question, SQL, and DataFrame metadata
  3. Code is executed in isolated namespace with df, go, and pd available
  4. Resulting fig object is serialized and sent to frontend
  5. Frontend renders using react-plotly.js
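Step 3's isolated-namespace execution can be sketched as below. In the real flow the namespace also exposes `go` (plotly.graph_objects) and `pd` (pandas); here a plain dict stands in for the DataFrame and a dict literal for the figure so the sketch runs without those packages installed:

```python
def run_viz_code(code, df):
    """Execute LLM-generated plotting code in a throwaway namespace and
    return the `fig` object it is expected to define."""
    namespace = {"df": df}  # real flow also injects go and pd here
    exec(code, namespace)
    if "fig" not in namespace:
        raise ValueError("Generated code did not define a `fig` object")
    return namespace["fig"]

# Stand-in for LLM-generated Plotly code.
generated = "fig = {'x': df['x'], 'type': 'bar'}"
fig = run_viz_code(generated, {"x": [1, 2, 3]})
```

Note that `exec` on LLM output is only "isolated" in the namespace sense; it offers no security sandboxing.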

Voice Assistant Flow

  1. Audio uploaded via /upload_audio or recorded via /record_audio
  2. Groq Whisper transcribes audio to text
  3. Transcribed question follows normal SQL generation flow
  4. Results summary is converted to speech via Groq TTS (Play3.0 model)
  5. Audio response returned as base64-encoded WAV
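Step 5's transport encoding is plain base64, sketched here with the standard library (function names are illustrative):

```python
import base64

def encode_audio_for_json(wav_bytes):
    """Base64-encode raw WAV bytes for transport inside a JSON response."""
    return base64.b64encode(wav_bytes).decode("ascii")

def decode_audio(b64_string):
    """Client-side inverse: recover the playable WAV bytes."""
    return base64.b64decode(b64_string)

payload = encode_audio_for_json(b"RIFF....WAVEfmt ")  # truncated WAV header
```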

Key Files and Their Roles

  • app.py: Main Flask application (~3500 lines) with all API endpoints, database management, and frontend serving
  • talk2sql/base.py: Abstract base class defining core SQL generation, retry, and visualization interfaces
  • talk2sql/engine/Talk2SQL_azure.py: Full Azure OpenAI implementation with query history persistence
  • talk2sql/engine/Talk2SQL_anthropic.py: Anthropic implementation (uses Azure OpenAI for embeddings)
  • talk2sql/vector_store/qdrant.py: Vector database operations (upsert, search, collection management)
  • databases/: Directory containing SQLite databases (nba.sqlite, fifa.sqlite, pokemon.sqlite, query_history.sqlite)
  • training_data/: JSON files with example question-SQL pairs for training
  • frontend/src/App.tsx: Main React component orchestrating all UI

Database Schema Extraction

When connecting to a database, the system automatically:

  1. Runs SELECT name FROM sqlite_master WHERE type='table' to get all tables
  2. For each table, runs PRAGMA table_info({table_name}) to get column details
  3. Constructs DDL-like schema strings and stores in Qdrant schema collection
  4. Loads training examples from training_data/{db_name}_*.json files if they exist
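Steps 1-3 can be sketched with the standard `sqlite3` module and an in-memory database (the helper name and the exact DDL string format are illustrative):

```python
import sqlite3

def extract_schema(conn):
    """Build DDL-like schema strings per table, mirroring steps 1-3 above."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    schemas = []
    for table in tables:
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        col_defs = ", ".join(f"{c[1]} {c[2]}" for c in cols)
        schemas.append(f"CREATE TABLE {table} ({col_defs})")
    return schemas

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER, name TEXT)")
schemas = extract_schema(conn)
```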

Query History Schema

The query_history.sqlite database tracks:

  • id, timestamp, question, sql, success, error_message, retry_count
  • data (JSON serialized DataFrame), columns, summary
  • visualization (pickled Plotly figure)
  • total_time_ms, sql_generation_time_ms, sql_execution_time_ms, visualization_time_ms, explanation_time_ms
  • timing_details (JSON with breakdown)
  • used_memory (boolean indicating if RAG context was used)
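An illustrative DDL matching the tracked columns above (the exact column types and constraints in the real query_history.sqlite may differ):

```python
import sqlite3

# Illustrative DDL for the fields listed above; types are assumptions.
HISTORY_DDL = """
CREATE TABLE IF NOT EXISTS query_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT,
    question TEXT,
    sql TEXT,
    success INTEGER,
    error_message TEXT,
    retry_count INTEGER,
    data TEXT,              -- JSON-serialized DataFrame
    columns TEXT,
    summary TEXT,
    visualization BLOB,     -- pickled Plotly figure
    total_time_ms REAL,
    sql_generation_time_ms REAL,
    sql_execution_time_ms REAL,
    visualization_time_ms REAL,
    explanation_time_ms REAL,
    timing_details TEXT,    -- JSON breakdown
    used_memory INTEGER     -- 1 if RAG context was used
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(HISTORY_DDL)
```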

Testing and Debugging

  • Set debug_mode: True in the Talk2SQL config to enable verbose logging
  • Check Flask console for detailed error traces and LLM prompts
  • Use /metrics endpoint to analyze success rates, latency, and error patterns
  • Use /analyze endpoint to identify common error types and query patterns
  • Query history is preserved in databases/query_history.sqlite for inspection

Important Implementation Notes

  • Thread Safety: The app uses threading for concurrent requests; each Talk2SQL instance should be thread-safe
  • Memory Usage: Query history stores full DataFrames and visualizations; consider periodic cleanup for production
  • SQL Extraction: Multiple regex patterns handle various LLM response formats (XML tags, markdown, raw SQL)
  • Retry Logic: Error messages are passed to LLM for correction; this enables learning from syntax errors
  • Config Metadata: SQL generation includes <config> section indicating if RAG context was used
  • Embeddings: Always use Azure OpenAI for embeddings (1536 dimensions) even when using Anthropic for completions
  • Voice Encoding: Audio is base64-encoded for JSON transport; decode on client before playback
  • Frontend Build: Flask serves frontend/dist/index.html for root route and frontend/dist/assets/ for static files
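The SQL-extraction note above amounts to trying stricter patterns first and falling back to the raw text. A simplified sketch (the app's actual patterns cover more formats):

```python
import re

def extract_sql(response):
    """Pull SQL out of an LLM response: try <sql> tags, then markdown
    code fences, then fall back to the raw text."""
    patterns = (r"<sql>(.*?)</sql>", r"```sql\s*(.*?)```")
    for pattern in patterns:
        match = re.search(pattern, response, re.DOTALL | re.IGNORECASE)
        if match:
            return match.group(1).strip()
    return response.strip()
```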

API Endpoint Categories

Full API documentation is in api.md. Key endpoint categories:

  • Database Management: /connect, /databases, /upload_database, /vector_store_status
  • Query Operations: /ask, /ask_stream, /follow_up_questions, /history, /metrics
  • Training & Learning: /upload_training_data, /feedback, /cleanup_duplicates
  • Voice Features: /record_audio, /upload_audio, /transcribe, /text_to_speech, /voice_assistant

Deployment Notes

  • Production deployment uses Gunicorn as WSGI server (see Dockerfile)
  • Fly.io configuration in fly.toml
  • Ensure .env variables are set in production environment
  • Frontend must be built (npm run build) before deploying
  • Database files should be persisted in mounted volumes