This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Text2SQL is a natural language to SQL query system with visualization capabilities and voice assistant features. It translates natural language questions into SQL, executes queries against SQLite databases, and provides interactive visualizations and voice-based interactions. The system uses either Azure OpenAI or Anthropic Claude for LLM capabilities, Qdrant for vector storage, and Groq for voice features.
The backend is a Flask application (app.py) serving both API endpoints and the React frontend build. The core system follows a layered architecture:
Layer 1: Talk2SQL Engine (talk2sql/)
- `base.py`: Abstract base class defining the core Talk2SQL interface (SQL generation, retry logic, visualization)
- `engine/Talk2SQL_azure.py`: Azure OpenAI implementation with query history tracking in SQLite
- `engine/Talk2SQL_anthropic.py`: Anthropic Claude implementation with similar capabilities
- Both implementations use multiple inheritance to combine LLM capabilities with vector storage
Layer 2: LLM Integration (talk2sql/llm/)
- `azure_openai.py`: Azure OpenAI client wrapper for completions and embeddings
- `anthropic.py`: Anthropic Claude client wrapper for completions
Layer 3: Vector Store (talk2sql/vector_store/)
- `qdrant.py`: Qdrant vector database integration for storing question-SQL pairs, schema, and documentation
- Uses three collections per database: questions, schema, and docs
- Embeddings are generated via Azure OpenAI (text-embedding-ada-002)
Key Components:
- RAG System: Uses vector similarity search to find relevant question-SQL examples, schema info, and documentation
- Retry Mechanism: Automatically retries failed SQL queries by generating corrected versions using error context (up to 3 attempts by default)
- Query History: Stores all queries, errors, retries, timing, and results in SQLite (`databases/query_history.sqlite`)
- Memory Tracking: Tracks whether RAG context (similar questions/schema) was used in SQL generation
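As one way to picture how the RAG pieces fit together, here is a minimal sketch of assembling retrieved context into a prompt. The function name, section markers, and overall prompt layout are illustrative assumptions, not the repository's actual implementation:

```python
def build_prompt(question, similar_pairs, schema_chunks, docs):
    """Assemble retrieved RAG context into an LLM prompt (illustrative sketch)."""
    parts = ["You translate natural language questions into SQLite SQL."]
    parts.append("Schema:\n" + "\n".join(schema_chunks))
    if docs:
        parts.append("Documentation:\n" + "\n".join(docs))
    for q, sql in similar_pairs:            # retrieved question-SQL examples
        parts.append(f"Q: {q}\nSQL: {sql}")
    parts.append(f"Q: {question}\nSQL:")    # the user's question goes last
    return "\n\n".join(parts)
```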
Built with React 18, TypeScript, and Tailwind CSS, using Vite for build tooling.
Main Components:
- `App.tsx`: Main application with tabbed interface (Query, Database, History, Metrics, Voice)
- `HistoryPage.tsx`: Query history browser with export functionality
- `MetricsPage.tsx`: Analytics dashboard with success rates, latency metrics, and error analysis
- `PlotlyChart.tsx`: Plotly.js visualization wrapper
- `AudioPlayer.tsx`: Voice playback controls
Key Features:
- Real-time streaming responses via Server-Sent Events
- Interactive Plotly visualizations auto-generated from query results
- Voice assistant with transcription and text-to-speech
- Query history with full result preservation and re-execution
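Server-Sent Events messages are plain text frames, so the streaming format is easy to illustrate. A small helper like the following (a sketch, not the app's code) shows the wire format a streaming endpoint would emit:

```python
import json

def format_sse(data: dict, event: str = "") -> str:
    """Frame a JSON payload as a Server-Sent Events message."""
    msg = f"event: {event}\n" if event else ""
    msg += f"data: {json.dumps(data)}\n\n"  # the blank line terminates the frame
    return msg
```

The browser's `EventSource` API parses these frames on the frontend side.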
```bash
# Install Python dependencies
pip install -r requirements.txt

# Run the Flask development server
python app.py

# The app runs on port 8080 by default
# Access at: http://localhost:8080
```

```bash
cd frontend

# Install dependencies
npm install

# Development mode (for frontend-only changes)
npm run dev

# Build for production (required before running Flask app)
npm run build

# The build outputs to frontend/dist/, which Flask serves automatically
```

Create a `.env` file in the project root with:
```bash
# Azure OpenAI (required for embeddings and main LLM)
AZURE_OPENAI_API_KEY=your_key
AZURE_ENDPOINT=your_endpoint
AZURE_API_VERSION=2024-02-15-preview
AZURE_DEPLOYMENT=gpt-4o-mini

# Anthropic (optional alternative to Azure OpenAI for completions)
ANTHROPIC_API_KEY=your_key

# Qdrant Vector Database (optional for persistent storage)
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_api_key

# Groq API (required for voice features)
GROQ_API_KEY=your_key
```

```bash
# Build and run with Docker
docker build -t text2sql .
docker run -p 8080:8080 text2sql

# Or deploy to fly.io
fly deploy
```

- User submits natural language question via `/ask` endpoint
- System generates embeddings for the question (Azure OpenAI)
- Vector search retrieves similar questions, relevant schema, and docs from Qdrant
- LLM (Azure OpenAI or Anthropic) generates SQL using RAG context
- SQL is extracted from LLM response (handles `<sql>` tags, markdown, etc.)
- System attempts to execute SQL against connected SQLite database
- If execution fails, error is passed back to LLM to generate corrected SQL
- Retry loop continues up to `max_retry_attempts` (default: 3)
- On success: results are returned, visualization code is generated, summary is created
- All data (query, SQL, results, timing, errors) is stored in query history database
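The generate/execute/retry steps above can be sketched as a small loop. `generate_sql` and `run_sql` are illustrative method names under assumed signatures, not necessarily the repository's:

```python
def ask_with_retry(question, engine, max_retry_attempts=3):
    """Generate SQL, execute it, and retry with error context on failure."""
    error = None
    for _ in range(max_retry_attempts):
        sql = engine.generate_sql(question, error=error)  # error context on retries
        try:
            return engine.run_sql(sql)                    # success: return results
        except Exception as exc:
            error = str(exc)                              # fed back to the LLM
    raise RuntimeError(f"Query failed after {max_retry_attempts} attempts: {error}")
```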
Per-database collections in Qdrant:
- `{db_name}_questions`: Question-SQL pairs from training data and user feedback
- `{db_name}_schema`: Database schema (DDL) extracted via `PRAGMA table_info`
- `{db_name}_docs`: Additional documentation strings
- After successful query, system checks if visualization is appropriate (`should_generate_visualization`)
- LLM generates Python/Plotly code based on the question, SQL, and DataFrame metadata
- Code is executed in isolated namespace with `df`, `go`, and `pd` available
- Resulting `fig` object is serialized and sent to frontend
- Frontend renders using react-plotly.js
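The isolated-namespace execution step can be sketched as below. This assumes pandas; the real system also injects `go` (Plotly graph objects), but a plain dict stands in for a figure here so the sketch stays dependency-light:

```python
import pandas as pd

def run_generated_code(code: str, df: pd.DataFrame):
    """Execute LLM-generated visualization code with only df/pd in scope."""
    namespace = {"df": df, "pd": pd}   # no access to caller globals
    exec(code, namespace)
    return namespace.get("fig")        # generated code is expected to assign `fig`
```

Keeping the namespace explicit limits what the generated code can see, though `exec` alone is not a security sandbox.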
- Audio uploaded via `/upload_audio` or recorded via `/record_audio`
- Groq Whisper transcribes audio to text
- Transcribed question follows normal SQL generation flow
- Results summary is converted to speech via Groq TTS (Play3.0 model)
- Audio response returned as base64-encoded WAV
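The base64 transport step is standard-library only; a sketch of both directions (function names are illustrative):

```python
import base64

def encode_audio(wav_bytes: bytes) -> str:
    """Base64-encode raw WAV bytes for JSON transport."""
    return base64.b64encode(wav_bytes).decode("ascii")

def decode_audio(payload: str) -> bytes:
    """Inverse: recover raw WAV bytes on the client side."""
    return base64.b64decode(payload)
```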
- `app.py`: Main Flask application (~3500 lines) with all API endpoints, database management, and frontend serving
- `talk2sql/base.py`: Abstract base class defining core SQL generation, retry, and visualization interfaces
- `talk2sql/engine/Talk2SQL_azure.py`: Full Azure OpenAI implementation with query history persistence
- `talk2sql/engine/Talk2SQL_anthropic.py`: Anthropic implementation (uses Azure OpenAI for embeddings)
- `talk2sql/vector_store/qdrant.py`: Vector database operations (upsert, search, collection management)
- `databases/`: Directory containing SQLite databases (nba.sqlite, fifa.sqlite, pokemon.sqlite, query_history.sqlite)
- `training_data/`: JSON files with example question-SQL pairs for training
- `frontend/src/App.tsx`: Main React component orchestrating all UI
When connecting to a database, the system automatically:
- Runs `SELECT name FROM sqlite_master WHERE type='table'` to get all tables
- For each table, runs `PRAGMA table_info({table_name})` to get column details
- Constructs DDL-like schema strings and stores them in the Qdrant schema collection
- Loads training examples from `training_data/{db_name}_*.json` files if they exist
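This introspection uses only the standard library; a minimal sketch (the function name and DDL string format are illustrative):

```python
import sqlite3

def extract_schema(conn: sqlite3.Connection) -> dict:
    """Build DDL-like strings for every table via sqlite_master and PRAGMA."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        col_defs = ", ".join(f"{c[1]} {c[2]}" for c in cols)
        schema[table] = f"CREATE TABLE {table} ({col_defs})"
    return schema
```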
The query_history.sqlite database tracks:
- `id`, `timestamp`, `question`, `sql`, `success`, `error_message`, `retry_count`
- `data` (JSON-serialized DataFrame), `columns`, `summary`
- `visualization` (pickled Plotly figure)
- `total_time_ms`, `sql_generation_time_ms`, `sql_execution_time_ms`, `visualization_time_ms`, `explanation_time_ms`
- `timing_details` (JSON with breakdown)
- `used_memory` (boolean indicating if RAG context was used)
- Set `debug_mode: True` in the Talk2SQL config to enable verbose logging
- Check Flask console for detailed error traces and LLM prompts
- Use the `/metrics` endpoint to analyze success rates, latency, and error patterns
- Use the `/analyze` endpoint to identify common error types and query patterns
- Query history is preserved in `databases/query_history.sqlite` for inspection
- Thread Safety: The app uses threading for concurrent requests; each Talk2SQL instance should be thread-safe
- Memory Usage: Query history stores full DataFrames and visualizations; consider periodic cleanup for production
- SQL Extraction: Multiple regex patterns handle various LLM response formats (XML tags, markdown, raw SQL)
- Retry Logic: Error messages are passed to LLM for correction; this enables learning from syntax errors
- Config Metadata: SQL generation includes a `<config>` section indicating if RAG context was used
- Embeddings: Always use Azure OpenAI for embeddings (1536 dimensions), even when using Anthropic for completions
- Voice Encoding: Audio is base64-encoded for JSON transport; decode on the client before playback
- Frontend Build: Flask serves `frontend/dist/index.html` for the root route and `frontend/dist/assets/` for static files
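The multi-format SQL extraction can be sketched with a couple of regex fallbacks; the patterns below are illustrative, not the app's exact ones:

```python
import re

def extract_sql(response: str) -> str:
    """Pull SQL from an LLM response: <sql> tags, then code fences, then raw text."""
    tag = re.search(r"<sql>(.*?)</sql>", response, re.DOTALL | re.IGNORECASE)
    if tag:
        return tag.group(1).strip()
    fence = re.search(r"`{3}(?:sql)?\s*(.*?)`{3}", response, re.DOTALL)
    if fence:
        return fence.group(1).strip()
    return response.strip()   # assume the whole response is SQL
```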
Full API documentation is in api.md. Key endpoint categories:
- Database Management: `/connect`, `/databases`, `/upload_database`, `/vector_store_status`
- Query Operations: `/ask`, `/ask_stream`, `/follow_up_questions`, `/history`, `/metrics`
- Training & Learning: `/upload_training_data`, `/feedback`, `/cleanup_duplicates`
- Voice Features: `/record_audio`, `/upload_audio`, `/transcribe`, `/text_to_speech`, `/voice_assistant`
- Production deployment uses Gunicorn as the WSGI server (see `Dockerfile`)
- Fly.io configuration is in `fly.toml`
- Ensure `.env` variables are set in the production environment
- Frontend must be built (`npm run build`) before deploying
- Database files should be persisted in mounted volumes