Create, version, and test LLM prompts visually - like Postman, but for prompts.
Prompt engineering is becoming a critical skill as LLMs become more prevalent. However, the tooling around prompt development is still in its infancy. PromptForge aims to fill this gap by providing:
- Visual Development: Build complex prompts without writing code
- Systematic Testing: Test prompts methodically across providers
- Performance Tracking: Measure and optimize prompt performance
- Version Control: Track prompt evolution over time
- Production Ready: Use prompts in your applications with confidence
Whether you're building AI applications, conducting research, or just experimenting with LLMs, PromptForge provides the tools you need to engineer better prompts.
PromptForge was created to solve the challenges developers and AI engineers face when working with Large Language Models (LLMs). Traditional prompt engineering involves:
- Manual testing - Copy-pasting prompts between different environments
- No version control - Difficult to track prompt iterations and improvements
- Limited testing - Hard to compare outputs across different LLM providers
- No metrics - No systematic way to measure prompt quality, latency, or cost
- Complex pipelines - Building multi-step prompt workflows is cumbersome
PromptForge aims to be the Postman for LLM prompts - providing a visual, intuitive interface for:
- Visual Prompt Building - Drag-and-drop interface to create complex prompt pipelines
- Comprehensive Testing - Test prompts across multiple LLM providers (OpenAI, Anthropic, Mistral, Google Gemini)
- Performance Metrics - Track latency, cost, quality, and similarity scores
- Version Control - Track prompt iterations and compare different versions
- Production Ready - Save and reuse prompts for your applications
PromptForge follows a microservices architecture with clear separation of concerns:
```
┌───────────────────────────────────────────────────────────────┐
│                      Frontend (Next.js)                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │  Dashboard   │  │   Prompts    │  │  Test Runs   │         │
│  └──────────────┘  └──────────────┘  └──────────────┘         │
│                                                               │
│  ┌────────────────────────────────────────────────────┐       │
│  │        Visual Prompt Builder (React Flow)          │       │
│  └────────────────────────────────────────────────────┘       │
└──────────────────────┬────────────────────────────────────────┘
                       │ HTTP/REST API
┌──────────────────────┴────────────────────────────────────────┐
│                  Node.js Backend (Express)                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │ Prompt CRUD  │  │ Test Run API │  │ Auth/Session │         │
│  └──────────────┘  └──────────────┘  └──────────────┘         │
│                                                               │
│  ┌────────────────────────────────────────────────────┐       │
│  │           Queue Service (Celery Tasks)             │       │
│  └────────────────────────────────────────────────────┘       │
└──────────────────────┬────────────────────────────────────────┘
                       │ HTTP API
┌──────────────────────┴────────────────────────────────────────┐
│               Python Backend (FastAPI + Celery)               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │ LLM Providers│  │   Scoring    │  │    Celery    │         │
│  │ (OpenAI,     │  │   Engine     │  │    Worker    │         │
│  │  Anthropic,  │  │              │  │              │         │
│  │  Mistral,    │  │              │  │              │         │
│  │  Gemini)     │  │              │  │              │         │
│  └──────────────┘  └──────────────┘  └──────────────┘         │
└──────────────────────┬────────────────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │                             │
┌───────┴─────────┐         ┌────────┴─────────┐
│   PostgreSQL    │         │      Redis       │
│   (Database)    │         │   (Job Queue)    │
└─────────────────┘         └──────────────────┘
```
```
PromptForge/
├── README.md            # This file - comprehensive project documentation
├── docker-compose.yml   # Docker Compose configuration for all services
├── Makefile             # Development automation commands
├── package.json         # Root package.json for workspace management
├── CONTRIBUTING.md      # Contribution guidelines
├── DUMMY_PROMPTS.md     # Sample prompts for testing
└── TEST_RUN_FLOW.md     # Documentation of test run execution flow
```
Why these files exist:
- docker-compose.yml: Orchestrates all microservices (frontend, backends, database, Redis) in a single command. Enables consistent development environments across different machines.
- Makefile: Provides convenient shortcuts for common development tasks (docker commands, database setup, etc.). Makes onboarding easier.
- DUMMY_PROMPTS.md: Contains example prompts that developers can use to test the platform without creating their own from scratch.
The frontend is a Next.js 14 application using the App Router pattern.
```
frontend/
├── app/                            # Next.js App Router pages (file-based routing)
│   ├── page.tsx                    # Landing page - welcomes users and shows features
│   ├── layout.tsx                  # Root layout with sidebar and session provider
│   ├── globals.css                 # Global styles and Tailwind CSS configuration
│   ├── middleware.ts               # Route protection - redirects unauthenticated users
│   ├── api/                        # API routes
│   │   └── auth/
│   │       └── [...nextauth]/      # NextAuth.js API route handler
│   │           └── route.ts        # Handles OAuth callbacks and session management
│   ├── auth/
│   │   └── signin/
│   │       └── page.tsx            # Sign-in page with OAuth providers (Google, GitHub)
│   ├── dashboard/
│   │   └── page.tsx                # Dashboard with stats, charts, and analytics
│   ├── prompts/
│   │   ├── page.tsx                # Prompts list page - shows all user prompts
│   │   ├── [id]/
│   │   │   └── page.tsx            # Individual prompt detail/edit page
│   │   └── builder/
│   │       └── page.tsx            # Visual prompt builder page (React Flow)
│   └── tests/
│       └── page.tsx                # Test runs list page - shows all test executions
├── components/                     # Reusable React components
│   ├── layout/
│   │   ├── Sidebar.tsx             # Main navigation sidebar (replaces navbar)
│   │   ├── ConditionalSidebar.tsx  # Conditionally renders sidebar based on route
│   │   └── Navbar.tsx              # Legacy navbar (kept for reference)
│   ├── prompt-builder/
│   │   └── PromptBuilder.tsx       # React Flow visual prompt builder component
│   ├── prompt-create/
│   │   └── CreatePromptDialog.tsx  # Dialog for creating new prompts
│   ├── prompt-list/
│   │   └── PromptList.tsx          # List view of all prompts with search/filter
│   ├── test-run/
│   │   ├── TestRunDialog.tsx       # Dialog for creating test runs
│   │   └── TestRunDetails.tsx      # Detailed view of test run results
│   ├── dashboard/
│   │   ├── Charts.tsx              # Recharts components for dashboard visualizations
│   │   └── StatsCard.tsx           # Reusable stat card component
│   ├── providers/
│   │   └── SessionProvider.tsx     # Client-side wrapper for NextAuth SessionProvider
│   └── ui/                         # Shadcn UI components (button, dialog, input, etc.)
├── lib/
│   ├── api.ts                      # API client - axios instance and API methods
│   ├── auth.ts                     # NextAuth configuration and helpers
│   └── utils.ts                    # Utility functions (cn helper for Tailwind)
├── types/
│   └── next-auth.d.ts              # TypeScript type extensions for NextAuth
├── package.json                    # Frontend dependencies
├── next.config.js                  # Next.js configuration (transpiles recharts)
├── tailwind.config.ts              # Tailwind CSS configuration
└── tsconfig.json                   # TypeScript configuration
```
Why this structure:
- App Router: Next.js 13+ App Router provides better performance, server components, and file-based routing.
- Component organization: Separated by feature (layout, prompt-builder, test-run) for better maintainability.
- lib/: Centralized API client and utilities reduce code duplication.
- types/: TypeScript definitions ensure type safety across the application.
The Node.js backend serves as the API gateway and handles business logic.
```
backend-node/
├── src/
│   ├── index.ts            # Express server entry point
│   │                       #   - Sets up Express app, CORS, middleware
│   │                       #   - Defines API routes (prompts, test-runs)
│   │                       #   - Handles prompt CRUD operations
│   │                       #   - Creates test runs and queues them
│   └── services/
│       └── queueService.ts # Celery task queue service
│                           #   - Sends test run tasks to Python backend
│                           #   - Handles async job orchestration
├── prisma/
│   └── schema.prisma       # Prisma ORM schema definition
│                           #   - User, Prompt, TestRun, PromptVersion models
│                           #   - Database relationships and indexes
├── Dockerfile              # Production Docker image
├── Dockerfile.dev          # Development Docker image (with hot reload)
├── package.json            # Node.js dependencies
└── tsconfig.json           # TypeScript configuration
```
Why Node.js backend:
- Type safety: TypeScript ensures type safety between frontend and backend.
- Prisma ORM: Provides type-safe database access and migrations.
- Express: Fast, minimal, and well-suited for REST APIs.
- Separation of concerns: Node.js handles data persistence, Python handles LLM execution.
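To make the Node-to-Python handoff concrete, here is a small sketch of how the queue service might shape a test-run task before enqueueing it. The task name `execute_prompt_task` comes from the Python backend's description above, but the payload fields and function name are illustrative assumptions, not PromptForge's actual contract.

```typescript
// Hypothetical shape of a test-run task handed from the Node.js queue service
// to the Python backend. Field names are assumptions for illustration only.
interface TestRunTask {
  taskName: string;               // Celery task to invoke on the Python side
  testRunId: string;              // TestRun record created by the Node.js backend
  promptId: string;               // Prompt to execute
  providers: string[];            // LLM providers selected for this run
  input: Record<string, string>;  // Template variables for substitution
}

function buildTestRunTask(
  testRunId: string,
  promptId: string,
  providers: string[],
  input: Record<string, string>
): TestRunTask {
  // Fail fast before anything reaches the queue.
  if (providers.length === 0) {
    throw new Error("A test run needs at least one LLM provider");
  }
  return { taskName: "execute_prompt_task", testRunId, promptId, providers, input };
}

const task = buildTestRunTask("run_1", "prompt_1", ["openai", "anthropic"], {
  topic: "unit testing",
});
console.log(task.taskName); // "execute_prompt_task"
```

Keeping validation on the Node.js side means malformed runs are rejected synchronously, while the (slow) LLM calls stay asynchronous behind the queue.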
The Python backend handles LLM execution and evaluation.
```
backend-python/
├── app/
│   ├── __init__.py        # Package initialization
│   ├── celery_app.py      # Celery application configuration
│   │                      #   - Defines async task: execute_prompt_task
│   │                      #   - Handles test run execution workflow
│   ├── llm_providers.py   # LLM provider implementations
│   │                      #   - OpenAIProvider: OpenAI API integration
│   │                      #   - AnthropicProvider: Claude API integration
│   │                      #   - MistralProvider: Mistral AI integration
│   │                      #   - GeminiProvider: Google Gemini integration
│   │                      #   - Prompt template variable substitution
│   └── scoring_engine.py  # Evaluation metrics computation
│                          #   - Similarity scoring (semantic similarity)
│                          #   - Latency measurement
│                          #   - Cost calculation
├── main.py                # FastAPI application entry point
│                          #   - Health check endpoint
│                          #   - API documentation
├── requirements.txt       # Python dependencies
├── Dockerfile             # Production Docker image
└── Dockerfile.dev         # Development Docker image
```
Why Python backend:
- LLM libraries: Python has the best ecosystem for LLM integrations (OpenAI, Anthropic, etc.).
- Celery: Industry-standard async task queue for Python.
- FastAPI: Modern, fast API framework with automatic OpenAPI documentation.
- ML libraries: Easy integration with ML libraries for scoring (sentence-transformers, scikit-learn).
```
shared/
└── types/
    └── index.ts  # Shared TypeScript types
                  #   - Prompt interface
                  #   - PromptContent interface
                  #   - TestRun interface
                  #   - Ensures type consistency across frontend/backend
```
Why shared types:
- Type safety: Ensures frontend and backend use the same data structures.
- Single source of truth: Changes to types are reflected everywhere.
- Reduces bugs: TypeScript catches mismatches at compile time.
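As a sketch of what such shared definitions look like, here are hypothetical versions of the three interfaces named above. The actual fields in `shared/types/index.ts` may differ; the point is that frontend and backend both import from one module, so a mismatch fails at compile time instead of at runtime.

```typescript
// Hypothetical shared types - field names are illustrative assumptions,
// not the real contents of shared/types/index.ts.
interface PromptContent {
  template: string;    // e.g. "Summarize {{topic}} in one sentence"
  variables: string[]; // variable names the template expects
}

interface Prompt {
  id: string;
  name: string;
  content: PromptContent;
}

interface TestRun {
  id: string;
  promptId: string;
  provider: string;                                      // e.g. "openai"
  status: "pending" | "running" | "completed" | "failed"; // assumed lifecycle
}

// Both the API client and the Express handlers would type against TestRun,
// so adding or renaming a field breaks both builds at once.
const run: TestRun = { id: "t1", promptId: "p1", provider: "openai", status: "pending" };
console.log(run.status); // "pending"
```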
```
├── .gitignore         # Git ignore patterns (node_modules, .env, etc.)
├── docker-compose.yml # Service orchestration
└── Makefile           # Development automation
```
Start everything with a single command:
```bash
make docker-up
```

Or manually:

```bash
docker-compose up -d
```

This will start:
- Frontend (http://localhost:3000)
- Node.js Backend API (http://localhost:4000)
- Python Backend API (http://localhost:8000)
- Celery Worker (background tasks)
- PostgreSQL (database) - Port 5433
- Redis (job queue)
```bash
make docker-down     # Stop all services
make docker-logs     # View logs
make docker-build    # Rebuild images
make docker-restart  # Restart services
```

After starting services for the first time:
```bash
make setup-db
```

This will:
- Generate Prisma client
- Run database migrations
- Create all necessary tables and enums
Create a .env file in the root directory (or set environment variables):
```bash
# Required
NEXTAUTH_SECRET=your-secret-key-here
OPENAI_API_KEY=sk-...  # Or at least one LLM provider key

# Optional - LLM Provider Keys
ANTHROPIC_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=

# Optional - OAuth (for user authentication)
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=

# Database (usually handled by docker-compose)
DATABASE_URL=postgresql://promptforge:promptforge@postgres:5432/promptforge
```

Frontend-specific (frontend/.env.local):
```bash
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=your-secret-key-here
NEXT_PUBLIC_API_URL=http://localhost:4000
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
```

The docker-compose.yml will automatically use these environment variables.
```bash
# Start all services
make docker-up

# View logs
make docker-logs

# Restart a specific service
docker-compose restart frontend

# Stop all services
make docker-down
```

If you prefer to run services locally:
```bash
# Install dependencies
make install

# Start services manually (4 terminals):
# Terminal 1: make dev-frontend
# Terminal 2: make dev-backend
# Terminal 3: cd backend-python && source venv/bin/activate && uvicorn main:app --reload
# Terminal 4: cd backend-python && source venv/bin/activate && celery -A app.celery_app worker --loglevel=info
```

See all available commands:

```bash
make help
```

- Create, read, update, delete prompts
- Version control for prompt iterations
- Search and filter prompts
- Template variable support (`{{variable}}`)
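Template variables make one prompt reusable across many inputs. A minimal sketch of `{{variable}}` substitution, similar in spirit to what the Python backend does before calling an LLM (the function name and error behavior here are illustrative assumptions):

```typescript
// Replace each {{name}} placeholder with the matching value from `vars`.
// Throwing on a missing variable is an assumed policy, not PromptForge's.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, name: string) => {
    if (!(name in vars)) {
      throw new Error(`Missing value for template variable "${name}"`);
    }
    return vars[name];
  });
}

console.log(
  renderTemplate("Summarize {{topic}} in {{count}} words.", {
    topic: "vector databases",
    count: "50",
  })
); // "Summarize vector databases in 50 words."
```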
- Test prompts across multiple LLM providers
- Compare outputs from different models
- Async execution via Celery workers
- Real-time status updates
- Latency: Response time measurement
- Cost: Token usage and cost calculation
- Quality: Semantic similarity scoring
- Analytics: Dashboard with charts and statistics
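Cost tracking reduces to token counts multiplied by per-token prices. A hedged sketch - the prices below are placeholders, since real provider rates vary by model and change over time:

```typescript
// Per-1,000-token prices in USD. These numbers are placeholders for
// illustration, NOT real provider rates.
interface Pricing {
  inputPer1K: number;  // prompt tokens
  outputPer1K: number; // completion tokens
}

function estimateCostUSD(
  promptTokens: number,
  completionTokens: number,
  p: Pricing
): number {
  return (promptTokens / 1000) * p.inputPer1K + (completionTokens / 1000) * p.outputPer1K;
}

const placeholder: Pricing = { inputPer1K: 0.001, outputPer1K: 0.002 };
// 500 prompt tokens + 1,000 completion tokens at the placeholder rates:
console.log(estimateCostUSD(500, 1000, placeholder)); // 0.0025
```

Summing this per test run lets the dashboard compare not just which provider answers best, but what each answer costs.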
- OAuth integration (Google, GitHub)
- Secure session management
- Protected routes
- User-specific prompts and test runs
```bash
# Check logs
make docker-logs

# Rebuild images
make docker-build

# Restart services
make docker-restart

# Reset database
docker-compose down -v
docker-compose up -d postgres redis
make setup-db
```

If ports 3000, 4000, 8000, 5433, or 6379 are in use, either:
- Stop the conflicting services
- Update port mappings in `docker-compose.yml`
- Ensure `NEXT_PUBLIC_API_URL` is set correctly
- Check that backend-node is running on port 4000
- Verify CORS is configured in `backend-node/src/index.ts`
- Check Celery worker logs: `docker-compose logs celery-worker`
- Verify Redis is running: `docker-compose ps redis`
- Ensure LLM API keys are set in environment variables
Use the examples in DUMMY_PROMPTS.md to create test prompts.
1. Create a prompt in the UI
2. Go to the Test Runs page
3. Click the "Test Run" button
4. Select a prompt and provide test input (JSON format)
5. View results with metrics
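The test input supplies values for the prompt's `{{variable}}` placeholders. A minimal example - the keys here are illustrative and depend entirely on which variables your prompt defines:

```json
{
  "topic": "vector databases",
  "audience": "beginners"
}
```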
- Frontend README - Frontend-specific documentation
- Node.js Backend README - Backend API documentation
- Python Backend README - LLM execution engine docs
- Contributing Guidelines - How to contribute
- Dummy Prompts - Example prompts for testing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License.
Sushant R. Dangal