A command-line tool for IVR/contact center prompt generation (text-to-speech) and audio transcription (speech-to-text).
macOS / Linux:
curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/install.sh | sh
Windows (PowerShell):
irm https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/install.ps1 | iex
Or download from Releases.
You need at least one TTS provider key to generate prompts, and one STT provider key to transcribe audio.
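As a rough illustration of that requirement, the sketch below reports which capabilities a given set of providers covers. The function name `covers` is hypothetical and not part of the CLI; the provider-to-capability mapping is an assumption drawn from the setup notes in this README (Google and OpenAI keys cover both, ElevenLabs is used for text-to-speech, AssemblyAI for transcription).

```shell
# Hypothetical helper, not part of prompt-tools: report whether a set of
# providers gives you text-to-speech (tts) and speech-to-text (stt).
# The mapping below is an assumption based on this README's setup notes.
covers() (
  tts=no
  stt=no
  for p in "$@"; do
    case "$p" in
      google|openai) tts=yes; stt=yes ;;  # one key covers both
      elevenlabs)    tts=yes ;;           # text-to-speech only
      assemblyai)    stt=yes ;;           # transcription only
      *) echo "unknown provider: $p" >&2; exit 1 ;;
    esac
  done
  echo "tts=$tts stt=$stt"
)
```

For example, `covers elevenlabs assemblyai` reports both capabilities, while `covers elevenlabs` alone leaves transcription uncovered.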
One Google API key covers both Text-to-Speech and Speech-to-Text.
- Go to Google Cloud Console
- Create a project or select an existing one
- Enable the Cloud Text-to-Speech API (for Neural2, Studio, Wavenet, Chirp voices)
- Enable the Generative Language API (for Gemini voices)
- Enable the Cloud Speech-to-Text API (for transcription)
- Go to APIs & Services > Credentials
- Click Create Credentials > API Key
- Click Edit API Key and under API restrictions, select Restrict key and add all three APIs:
  - Cloud Text-to-Speech API
  - Generative Language API
  - Cloud Speech-to-Text API
- Copy the key
New accounts get $300 in free credits. TTS pricing is ~$4 per 1M characters for Neural2 voices, ~$16/1M for Chirp3-HD.
- Sign up at elevenlabs.io
- Go to Profile + API Key
- Create an API key with these permissions: Text to Speech > Access, Voices > Read, Models > Access
- Copy your API key
Free tier includes limited characters per month. Paid plans start at $5/mo.
- Sign up at assemblyai.com
- Go to your Dashboard
- Copy your API key from the sidebar
Free tier includes transcription hours. Pay-as-you-go at $0.37/hour after that.
One OpenAI API key covers both Text-to-Speech and Speech-to-Text (transcription).
- Sign up at platform.openai.com
- Go to API Keys
- Click Create new secret key
- Copy your API key
Pay-as-you-go pricing. TTS at $0.015/1K characters (tts-1) or $0.030/1K (tts-1-hd). Transcription at $0.006/minute (Whisper) or $0.01/minute (GPT-4o-transcribe).
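To get a feel for these rates, here is a minimal cost-estimate sketch. The helper names `estimate_tts_cost` and `estimate_stt_cost` are hypothetical and not part of the CLI; the rates are passed in as arguments, so the figures quoted above are inputs rather than constants. Actual billing may round differently.

```shell
# Hypothetical helpers, not part of prompt-tools.
# estimate_tts_cost CHARS RATE_PER_1K_CHARS  -> dollars, 4 decimal places
estimate_tts_cost() {
  awk -v c="$1" -v r="$2" 'BEGIN { printf "%.4f\n", c / 1000 * r }'
}

# estimate_stt_cost AUDIO_SECONDS RATE_PER_MINUTE  -> dollars, 4 decimal places
estimate_stt_cost() {
  awk -v s="$1" -v r="$2" 'BEGIN { printf "%.4f\n", s / 60 * r }'
}
```

For instance, 500 prompts of about 80 characters each is 40,000 characters: `estimate_tts_cost 40000 0.015` gives 0.6000 dollars with tts-1, and `estimate_stt_cost 600 0.006` gives 0.0600 dollars for ten minutes of Whisper transcription.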
# Interactive setup (recommended for first run)
prompt-tools setup
# Or set keys individually
prompt-tools config set-api-key google
prompt-tools config set-api-key elevenlabs
prompt-tools config set-api-key assemblyai
prompt-tools config set-api-key openai
Keys are stored in your OS keyring (macOS Keychain / Linux keyring / Windows Credential Manager), never in plain text files.
# Interactive setup (API keys, defaults)
prompt-tools setup
# Generate a prompt
prompt-tools speak "Welcome to customer support." -o welcome.wav
# Use a Gemini voice (highest quality)
prompt-tools speak "Welcome to customer support." --voice Achernar -o welcome.wav
# List available voices
prompt-tools voices --language en-US --output table
# Bulk generate from spreadsheet
prompt-tools bulk template --output prompts.xlsx # Create template
prompt-tools bulk generate --file prompts.xlsx --output-dir ./output
# Transcribe audio
prompt-tools transcribe --file recording.wav
Google offers 400+ voices across multiple model families and is the default provider.

| Model | Quality | Example Voice | API Used | Notes |
|---|---|---|---|---|
| Gemini | Highest | Achernar, Kore, Puck | Generative Language | Bare names, auto-selects best model |
| Chirp3-HD | High | en-US-Chirp3-HD-Achernar | Cloud TTS | Same voices, different model |
| Studio | High | en-US-Studio-O | Cloud TTS | Studio-grade |
| Neural2 | Good | en-US-Neural2-F | Cloud TTS | Neural voices |
| Wavenet | Good | en-US-Wavenet-A | Cloud TTS | DeepMind Wavenet |
| Standard | Basic | en-US-Standard-A | Cloud TTS | Concatenative |
Gemini voices use the Generative Language API (must be enabled separately). The best available Gemini TTS model is auto-selected. Override with --model:
prompt-tools speak "Hello" --voice Kore --model gemini-2.5-flash-preview-tts -o hello.wav
ElevenLabs provides premium natural voices. Output is converted to IVR-compatible formats (mu-law/A-law WAV) automatically. Voices can be specified by name (e.g., Sarah, Roger) or by voice ID.

| Model | Quality | Notes |
|---|---|---|
| eleven_v3 | Highest | Latest model (default) |
| eleven_multilingual_v2 | High | Multilingual |
| eleven_flash_v2_5 | Good | Fast, low latency |
| eleven_turbo_v2_5 | Good | Low latency, multilingual |
prompt-tools speak "Hello" --provider elevenlabs --voice Sarah -o hello.wav
prompt-tools speak "Hello" --provider elevenlabs --voice Sarah --model eleven_multilingual_v2 -o hello.wav
prompt-tools voices --provider elevenlabs --output table
OpenAI provides high-quality natural voices with a simple API. Output is converted to IVR-compatible formats automatically. Voices: alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer, verse.

| Model | Quality | Notes |
|---|---|---|
| gpt-4o-mini-tts | High | Default, most capable |
| tts-1 | Standard | Lower latency |
| tts-1-hd | High | High definition |
prompt-tools speak "Hello" --provider openai --voice alloy -o hello.wav
prompt-tools speak "Hello" --provider openai --voice nova --model tts-1-hd -o hello.wav
prompt-tools voices --provider openai --output table
Google speech-to-text: sync recognition for short audio, phrase boosting, and word-level timestamps.
AssemblyAI: async transcription with polling, high accuracy, and automatic punctuation.
prompt-tools transcribe --file recording.wav --provider assemblyai
OpenAI: synchronous transcription using Whisper and GPT-4o models. Supports word timestamps and phrase boosting via prompt.
prompt-tools transcribe --file recording.wav --provider openai
Generate hundreds of prompts from a spreadsheet. Supports .xlsx and .csv.
# Create a template
prompt-tools bulk template --output prompts.xlsx
# Validate without generating
prompt-tools bulk validate --file prompts.xlsx
# Generate all prompts
prompt-tools bulk generate --file prompts.xlsx --output-dir ./output
# With options
prompt-tools bulk generate --file prompts.csv --output-dir ./output \
  --concurrency 10 --skip-existing --continue-on-error

| Filename | Voice | Text | SSML | Sample Rate | Encoding | Notes |
|---|---|---|---|---|---|---|
| welcome.wav | en-US-Chirp3-HD-Achernar | Welcome to support. | no | | | Main greeting |
| es-MX/welcome.wav | es-MX-Chirp3-HD-A | Bienvenido. | no | | | Subdirectory |
| transfer.wav | Achernar | Hold please. | no | | | Gemini voice |
| #holiday.wav | en-US-Chirp3-HD-Achernar | Closed for holiday. | no | | | Skipped |
- Rows starting with # are skipped
- Voice, Sample Rate, and Encoding are optional (defaults from config)
- Filename supports subdirectories; folders are created automatically
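A CSV input following these rules might be produced as in the sketch below. This assumes the CSV header row uses the same column names as the spreadsheet table; the file written by `prompt-tools bulk template` is the authoritative layout. The helper names `write_sample_csv` and `count_active_rows` are hypothetical.

```shell
# Hypothetical helpers, not part of prompt-tools.
# Write a small sample CSV; column names are assumed to match the
# spreadsheet layout shown in this README.
write_sample_csv() {
  cat > "$1" <<'EOF'
Filename,Voice,Text,SSML,Sample Rate,Encoding,Notes
welcome.wav,en-US-Chirp3-HD-Achernar,Welcome to support.,no,,,Main greeting
es-MX/welcome.wav,es-MX-Chirp3-HD-A,Bienvenido.,no,,,Subdirectory
#holiday.wav,en-US-Chirp3-HD-Achernar,Closed for holiday.,no,,,Skipped
EOF
}

# Count data rows that would not be skipped (header excluded, rows
# starting with # ignored).
count_active_rows() {
  tail -n +2 "$1" | grep -vc '^#'
}
```

After `write_sample_csv prompts.csv`, running `prompt-tools bulk validate --file prompts.csv` is a reasonable way to check the file before generating. Prompt text containing commas would need standard CSV quoting.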
# Transcribe a directory
prompt-tools batch-transcribe --dir ./recordings --output-dir ./transcripts
# Specific files
prompt-tools batch-transcribe --files "a.wav,b.wav" --output-format csv
# With concurrency
prompt-tools batch-transcribe --dir ./recordings --concurrency 10 --continue-on-error
API keys are stored in the OS keyring (macOS Keychain / Linux keyring / Windows Credential Manager).
prompt-tools setup # Interactive wizard
prompt-tools config set-api-key google # Set Google API key
prompt-tools config set-api-key elevenlabs # Set ElevenLabs API key
prompt-tools config set-api-key assemblyai # Set AssemblyAI API key
prompt-tools config set-api-key openai # Set OpenAI API key
prompt-tools config clear-api-key google # Remove a key
prompt-tools config show # Show config and key status
Key resolution order: environment variable (GOOGLE_API_KEY, ELEVENLABS_API_KEY, ASSEMBLYAI_API_KEY, OPENAI_API_KEY) > OS keyring.
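The resolution order can be mirrored in a small pre-flight check, for example in a CI script. This is only a sketch of the documented order, not the CLI's actual implementation; `env_var_for` and `key_source` are hypothetical names. It reports `keyring` whenever the environment variable is unset, without checking whether the keyring actually holds a key.

```shell
# Hypothetical helpers, not part of prompt-tools.
# Map a provider name to the environment variable listed above.
env_var_for() {
  case "$1" in
    google)     echo GOOGLE_API_KEY ;;
    elevenlabs) echo ELEVENLABS_API_KEY ;;
    assemblyai) echo ASSEMBLYAI_API_KEY ;;
    openai)     echo OPENAI_API_KEY ;;
    *) return 1 ;;
  esac
}

# Report where a key would be looked up first: the environment variable
# if it is set and non-empty, otherwise the OS keyring.
key_source() (
  var="$(env_var_for "$1")" || { echo "unknown provider: $1" >&2; exit 1; }
  eval "val=\${$var:-}"   # safe: $var comes from the fixed list above
  if [ -n "$val" ]; then
    echo "env:$var"
  else
    echo "keyring"
  fi
)
```

Because the environment variable takes precedence, exporting `OPENAI_API_KEY` in a CI job should override any key stored with `prompt-tools config set-api-key openai`.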
Default output is 8kHz mu-law WAV — the North American IVR/telephony standard.
| Use Case | Sample Rate | Encoding | Flags |
|---|---|---|---|
| North American IVR | 8000 | mulaw | (default) |
| European IVR | 8000 | alaw | --encoding alaw |
| Wideband / modern | 16000 | linear16 | --sample-rate 16000 --encoding linear16 |
| General purpose | — | mp3 | --format mp3 |
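For capacity planning, the audio payload size of the uncompressed formats follows directly from sample rate and encoding: mu-law and A-law use one byte per sample, linear16 uses two. The sketch below assumes mono audio and ignores the WAV header; `audio_bytes` is a hypothetical helper, not a CLI command.

```shell
# Hypothetical helper, not part of prompt-tools.
# audio_bytes SECONDS SAMPLE_RATE ENCODING -> payload size in bytes
# Assumes mono audio; the WAV header is not included.
audio_bytes() (
  case "$3" in
    mulaw|alaw) bytes_per_sample=1 ;;  # 8 bits per sample
    linear16)   bytes_per_sample=2 ;;  # 16 bits per sample
    *) echo "unsupported encoding: $3" >&2; exit 1 ;;
  esac
  echo $(( $1 * $2 * bytes_per_sample ))
)
```

A 10-second prompt at the default 8000 Hz mu-law comes to 80000 bytes, while the same prompt at 16000 Hz linear16 comes to 320000 bytes.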
prompt-tools config set-sample-rate 8000
prompt-tools config set-encoding mulaw
prompt-tools config set-format wav
Control output with --output:

| Format | Description |
|---|---|
| json | Pretty-printed JSON (default) |
| table | ASCII table with terminal-width formatting |
| csv | CSV with headers |
| raw | Raw output |
prompt-tools voices --language en-US --output table

| Flag | Description |
|---|---|
| --output json\|table\|csv\|raw | Output format (default: json) |
| --debug | Show HTTP request/response details |
| --dry-run | Show plan without executing |
# Zsh
prompt-tools completion zsh > "${fpath[1]}/_prompt-tools"
# Bash
prompt-tools completion bash > /etc/bash_completion.d/prompt-tools
# Fish
prompt-tools completion fish > ~/.config/fish/completions/prompt-tools.fish
A skill file is included at skill/SKILL.md that teaches AI coding agents how to use the CLI.
The installer and the prompt-tools post-install command offer to install the skill for detected agents (Claude Code, Claude Cowork, OpenAI Codex, Cursor) via an interactive menu. Skills are also kept up to date when you run prompt-tools update.
If you prefer to install manually:
Claude Code:
mkdir -p ~/.claude/skills/prompt-tools
curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/skill/SKILL.md \
  -o ~/.claude/skills/prompt-tools/SKILL.md
OpenAI Codex:
mkdir -p ~/.codex/skills/prompt-tools
curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/skill/SKILL.md \
  -o ~/.codex/skills/prompt-tools/SKILL.md
Cursor:
mkdir -p ~/.cursor/skills/prompt-tools
curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/skill/SKILL.md \
  -o ~/.cursor/skills/prompt-tools/SKILL.md
Claude Cowork: Run prompt-tools post-install to generate the ZIP, then upload it at: Claude Desktop → Cowork tab → Customize → Skills → + → Upload a skill.
For project-specific installation, place the skill file in your project directory instead of the user-level folder.
See CLAUDE.md for project structure and conventions.
make build # Build binary
make check # Build + go vet
go test ./... # Run tests
License: MIT