Prompt Tools CLI

A command-line tool for IVR/contact center prompt generation (text-to-speech) and audio transcription (speech-to-text).

Install

macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/install.sh | sh

Windows (PowerShell):

irm https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/install.ps1 | iex

Or download from Releases.

Getting API Keys

You need at least one TTS provider key to generate prompts, and one STT provider key to transcribe audio.

Google Cloud (TTS + STT)

One API key covers both Text-to-Speech and Speech-to-Text.

  1. Go to Google Cloud Console
  2. Create a project or select an existing one
  3. Enable the Cloud Text-to-Speech API (for Neural2, Studio, Wavenet, Chirp voices)
  4. Enable the Generative Language API (for Gemini voices)
  5. Enable the Cloud Speech-to-Text API (for transcription)
  6. Go to APIs & Services > Credentials
  7. Click Create Credentials > API Key
  8. Click Edit API Key and under API restrictions, select Restrict key and add all three APIs:
    • Cloud Text-to-Speech API
    • Generative Language API
    • Cloud Speech-to-Text API
  9. Copy the key

New accounts get $300 in free credits. TTS pricing is ~$4 per 1M characters for Neural2 voices, ~$16/1M for Chirp3-HD.
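
To make the pricing concrete, the arithmetic is characters multiplied by the per-million rate. The sketch below uses the approximate Neural2 and Chirp3-HD rates quoted above; `estimate_tts_cost` is a hypothetical helper for illustration, not a prompt-tools command, and the 500-prompt workload is an invented example.

```shell
# Rough TTS cost estimate: characters x (USD per 1M characters).
# estimate_tts_cost is a hypothetical helper, not part of prompt-tools.
estimate_tts_cost() {
  chars=$1
  usd_per_million=$2
  awk -v c="$chars" -v r="$usd_per_million" \
    'BEGIN { printf "%.2f\n", c * r / 1000000 }'
}

# Example workload: 500 prompts averaging 120 characters each.
total_chars=$((500 * 120))
echo "Neural2:   \$$(estimate_tts_cost "$total_chars" 4)"
echo "Chirp3-HD: \$$(estimate_tts_cost "$total_chars" 16)"
```

At these rates a 60,000-character prompt set costs about $0.24 with Neural2 and about $0.96 with Chirp3-HD. Check current Google pricing before budgeting.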

ElevenLabs (TTS)

  1. Sign up at elevenlabs.io
  2. Go to Profile + API Key
  3. Create an API key with these permissions: Text to Speech > Access, Voices > Read, Models > Access
  4. Copy your API key

Free tier includes limited characters per month. Paid plans start at $5/mo.

AssemblyAI (STT)

  1. Sign up at assemblyai.com
  2. Go to your Dashboard
  3. Copy your API key from the sidebar

Free tier includes transcription hours. Pay-as-you-go at $0.37/hour after that.

OpenAI (TTS + STT)

One API key covers both Text-to-Speech and Speech-to-Text (transcription).

  1. Sign up at platform.openai.com
  2. Go to API Keys
  3. Click Create new secret key
  4. Copy your API key

Pay-as-you-go pricing. TTS at $0.015/1K characters (tts-1) or $0.030/1K (tts-1-hd). Transcription at $0.006/minute (Whisper) or $0.01/minute (GPT-4o-transcribe).

Store Your Keys

# Interactive setup (recommended for first run)
prompt-tools setup

# Or set keys individually
prompt-tools config set-api-key google
prompt-tools config set-api-key elevenlabs
prompt-tools config set-api-key assemblyai
prompt-tools config set-api-key openai

Keys are stored in your OS keyring (macOS Keychain / Linux keyring / Windows Credential Manager), never in plain text files.

Quick Start

# Interactive setup (API keys, defaults)
prompt-tools setup

# Generate a prompt
prompt-tools speak "Welcome to customer support." -o welcome.wav

# Use a Gemini voice (highest quality)
prompt-tools speak "Welcome to customer support." --voice Achernar -o welcome.wav

# List available voices
prompt-tools voices --language en-US --output table

# Bulk generate from spreadsheet
prompt-tools bulk template --output prompts.xlsx    # Create template
prompt-tools bulk generate --file prompts.xlsx --output-dir ./output

# Transcribe audio
prompt-tools transcribe --file recording.wav

TTS Providers

Google Cloud TTS

400+ voices across multiple model families. Default provider.

| Model | Quality | Example Voice | API Used | Notes |
|---|---|---|---|---|
| Gemini | Highest | Achernar, Kore, Puck | Generative Language | Bare names, auto-selects best model |
| Chirp3-HD | High | en-US-Chirp3-HD-Achernar | Cloud TTS | Same voices, different model |
| Studio | High | en-US-Studio-O | Cloud TTS | Studio-grade |
| Neural2 | Good | en-US-Neural2-F | Cloud TTS | Neural voices |
| Wavenet | Good | en-US-Wavenet-A | Cloud TTS | DeepMind Wavenet |
| Standard | Basic | en-US-Standard-A | Cloud TTS | Concatenative |

Gemini voices use the Generative Language API (must be enabled separately). The best available Gemini TTS model is auto-selected. Override with --model:

prompt-tools speak "Hello" --voice Kore --model gemini-2.5-flash-preview-tts -o hello.wav
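
The voice names above follow a visible pattern: a bare name selects a Gemini voice, while Cloud TTS voices embed the model family in the name. The sketch below classifies a name by that pattern alone. `voice_family` is a hypothetical helper, and the CLI's own resolution logic may well differ.

```shell
# Guess the Google model family from a voice name, based only on the naming
# pattern shown in this README. voice_family is a hypothetical helper.
voice_family() {
  case "$1" in
    *-Chirp3-HD-*) echo "Chirp3-HD" ;;
    *-Studio-*)    echo "Studio" ;;
    *-Neural2-*)   echo "Neural2" ;;
    *-Wavenet-*)   echo "Wavenet" ;;
    *-Standard-*)  echo "Standard" ;;
    *-*)           echo "unknown" ;;   # hyphenated, but no known family
    *)             echo "Gemini" ;;    # bare name such as Achernar or Kore
  esac
}

voice_family en-US-Chirp3-HD-Achernar
voice_family en-US-Neural2-F
voice_family Kore
```

This can be handy when auditing a spreadsheet to see which API (Cloud TTS or Generative Language) each row is likely to need.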

ElevenLabs

Premium natural voices. Output is converted to IVR-compatible formats (mu-law/A-law WAV) automatically. Voices can be specified by name (e.g., Sarah, Roger) or voice ID.

| Model | Quality | Notes |
|---|---|---|
| eleven_v3 | Highest | Latest model (default) |
| eleven_multilingual_v2 | High | Multilingual |
| eleven_flash_v2_5 | Good | Fast, low latency |
| eleven_turbo_v2_5 | Good | Low latency, multilingual |

prompt-tools speak "Hello" --provider elevenlabs --voice Sarah -o hello.wav
prompt-tools speak "Hello" --provider elevenlabs --voice Sarah --model eleven_multilingual_v2 -o hello.wav
prompt-tools voices --provider elevenlabs --output table

OpenAI

High quality natural voices with a simple API. Output is converted to IVR-compatible formats automatically. Voices: alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer, verse.

| Model | Quality | Notes |
|---|---|---|
| gpt-4o-mini-tts | High | Default, most capable |
| tts-1 | Standard | Lower latency |
| tts-1-hd | High | High definition |

prompt-tools speak "Hello" --provider openai --voice alloy -o hello.wav
prompt-tools speak "Hello" --provider openai --voice nova --model tts-1-hd -o hello.wav
prompt-tools voices --provider openai --output table

STT Providers

Google Cloud STT

Sync recognition for short audio, phrase boosting, word-level timestamps.

AssemblyAI

Async transcription with polling, high accuracy, automatic punctuation.

prompt-tools transcribe --file recording.wav --provider assemblyai

OpenAI

Synchronous transcription using Whisper and GPT-4o models. Supports word timestamps and phrase boosting via prompt.

prompt-tools transcribe --file recording.wav --provider openai

Bulk Processing

Generate hundreds of prompts from a spreadsheet. Supports .xlsx and .csv.

# Create a template
prompt-tools bulk template --output prompts.xlsx

# Validate without generating
prompt-tools bulk validate --file prompts.xlsx

# Generate all prompts
prompt-tools bulk generate --file prompts.xlsx --output-dir ./output

# With options
prompt-tools bulk generate --file prompts.csv --output-dir ./output \
  --concurrency 10 --skip-existing --continue-on-error
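
One way to combine the commands above is to validate first and generate only when validation passes. This is a minimal sketch that assumes `bulk validate` exits non-zero on an invalid sheet, which this README does not state. `bulk_safe` and the `PROMPT_TOOLS` variable are hypothetical names used only here.

```shell
# PROMPT_TOOLS selects the binary to run. It is a convention of this sketch,
# not a variable the CLI reads.
PROMPT_TOOLS=${PROMPT_TOOLS:-prompt-tools}

# Validate the sheet, then generate only if validation succeeded.
# Assumption: `bulk validate` returns a non-zero exit status on errors.
bulk_safe() {
  sheet=$1
  out_dir=$2
  "$PROMPT_TOOLS" bulk validate --file "$sheet" || {
    echo "validation failed for $sheet; nothing generated" >&2
    return 1
  }
  "$PROMPT_TOOLS" bulk generate --file "$sheet" --output-dir "$out_dir" \
    --skip-existing --continue-on-error
}

# Usage: bulk_safe prompts.xlsx ./output
```

Pairing --skip-existing with a validation gate makes reruns cheap: a corrected sheet regenerates only the files that are still missing.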

Spreadsheet Format

| Filename | Voice | Text | SSML | Sample Rate | Encoding | Notes |
|---|---|---|---|---|---|---|
| welcome.wav | en-US-Chirp3-HD-Achernar | Welcome to support. | no | | | Main greeting |
| es-MX/welcome.wav | es-MX-Chirp3-HD-A | Bienvenido. | no | | | Subdirectory |
| transfer.wav | Achernar | Hold please. | no | | | Gemini voice |
| #holiday.wav | en-US-Chirp3-HD-Achernar | Closed for holiday. | no | | | Skipped |

  • Rows starting with # are skipped
  • Voice, Sample Rate, and Encoding are optional (defaults from config)
  • Filename supports subdirectories; folders are created automatically
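
As an illustration, the same rows can be written as a CSV file by hand. The sketch below assumes the CSV header uses the column names shown above verbatim; running `prompt-tools bulk template` gives the authoritative layout.

```shell
# Write a small CSV sheet using the column names from this section.
# Assumption: the CSV header matches those names exactly.
cat > prompts.csv <<'EOF'
Filename,Voice,Text,SSML,Sample Rate,Encoding,Notes
welcome.wav,en-US-Chirp3-HD-Achernar,Welcome to support.,no,,,Main greeting
es-MX/welcome.wav,es-MX-Chirp3-HD-A,Bienvenido.,no,,,Subdirectory
transfer.wav,Achernar,Hold please.,no,,,Gemini voice
#holiday.wav,en-US-Chirp3-HD-Achernar,Closed for holiday.,no,,,Skipped
EOF

# Count the rows that should be processed: everything except the header
# and rows whose first character is '#'.
awk 'NR > 1 && $0 !~ /^#/ { n++ } END { print n + 0 }' prompts.csv
```

The count is 3 here because the #holiday.wav row is skipped. Sample Rate and Encoding are left empty so the configured defaults apply.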

Batch Transcription

# Transcribe a directory
prompt-tools batch-transcribe --dir ./recordings --output-dir ./transcripts

# Specific files
prompt-tools batch-transcribe --files "a.wav,b.wav" --output-format csv

# With concurrency
prompt-tools batch-transcribe --dir ./recordings --concurrency 10 --continue-on-error

Authentication

API keys are stored in the OS keyring (macOS Keychain / Linux keyring / Windows Credential Manager).

prompt-tools setup                          # Interactive wizard
prompt-tools config set-api-key google      # Set Google API key
prompt-tools config set-api-key elevenlabs  # Set ElevenLabs API key
prompt-tools config set-api-key assemblyai  # Set AssemblyAI API key
prompt-tools config set-api-key openai      # Set OpenAI API key
prompt-tools config clear-api-key google    # Remove a key
prompt-tools config show                    # Show config and key status

Key resolution order: environment variable (GOOGLE_API_KEY, ELEVENLABS_API_KEY, ASSEMBLYAI_API_KEY, OPENAI_API_KEY) > OS keyring.
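
The resolution order can be sketched as a small check that reports which source would win for each provider. `key_source` is a hypothetical helper that inspects only the environment; it cannot tell whether the keyring actually holds a key.

```shell
# Report where a provider key would come from under the documented order:
# environment variable first, then the OS keyring as the fallback.
# key_source is a hypothetical helper, not part of prompt-tools.
key_source() {
  var_name=$1
  eval "value=\${$var_name:-}"
  if [ -n "$value" ]; then
    echo "environment"
  else
    echo "keyring"
  fi
}

for name in GOOGLE_API_KEY ELEVENLABS_API_KEY ASSEMBLYAI_API_KEY OPENAI_API_KEY; do
  echo "$name: $(key_source "$name")"
done
```

Because the environment variable takes precedence, setting it is a natural fit for CI jobs or containers where no OS keyring is available.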

Audio Formats

Default output is 8kHz mu-law WAV — the North American IVR/telephony standard.

| Use Case | Sample Rate | Encoding | Flags |
|---|---|---|---|
| North American IVR | 8000 | mulaw | (default) |
| European IVR | 8000 | alaw | --encoding alaw |
| Wideband / modern | 16000 | linear16 | --sample-rate 16000 --encoding linear16 |
| General purpose | | mp3 | --format mp3 |

prompt-tools config set-sample-rate 8000
prompt-tools config set-encoding mulaw
prompt-tools config set-format wav
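
The practical difference between these settings shows up in file size. mu-law and A-law store one byte per sample, and linear16 stores two, so the audio payload is roughly duration × sample rate × bytes per sample. The sketch below assumes mono audio and ignores the WAV header; `audio_bytes` is a hypothetical helper.

```shell
# Approximate audio payload size for a mono prompt (excludes the WAV header).
# audio_bytes is a hypothetical helper, not part of prompt-tools.
audio_bytes() {
  seconds=$1
  sample_rate=$2
  encoding=$3
  case "$encoding" in
    mulaw|alaw) bytes_per_sample=1 ;;   # 8-bit companded samples
    linear16)   bytes_per_sample=2 ;;   # 16-bit PCM samples
    *) echo "unsupported encoding: $encoding" >&2; return 1 ;;
  esac
  echo $((seconds * sample_rate * bytes_per_sample))
}

audio_bytes 10 8000 mulaw       # 10 s at 8 kHz mu-law
audio_bytes 10 16000 linear16   # 10 s at 16 kHz linear16
```

A 10-second prompt is about 80,000 bytes at the 8 kHz mu-law default and about 320,000 bytes at 16 kHz linear16, a fourfold difference.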

Output Formats

Control output with --output:

| Format | Description |
|---|---|
| json | Pretty-printed JSON (default) |
| table | ASCII table with terminal-width formatting |
| csv | CSV with headers |
| raw | Raw output |

prompt-tools voices --language en-US --output table

Global Flags

| Flag | Description |
|---|---|
| --output FORMAT | Output format: json, table, csv, or raw (default: json) |
| --debug | Show HTTP request/response details |
| --dry-run | Show plan without executing |

Shell Completions

# Zsh
prompt-tools completion zsh > "${fpath[1]}/_prompt-tools"

# Bash (writing under /etc usually requires root, so pipe through sudo tee)
prompt-tools completion bash | sudo tee /etc/bash_completion.d/prompt-tools > /dev/null

# Fish
prompt-tools completion fish > ~/.config/fish/completions/prompt-tools.fish

Coding Agent Skill

A skill file is included at skill/SKILL.md that teaches AI coding agents how to use the CLI.

Automatic Setup

Both the installer and the prompt-tools post-install command offer to install the skill for detected agents (Claude Code, Claude Cowork, OpenAI Codex, Cursor) through an interactive menu. Running prompt-tools update also keeps installed skills up to date.

Manual Setup

If you prefer to install manually:

Claude Code:

mkdir -p ~/.claude/skills/prompt-tools
curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/skill/SKILL.md \
  -o ~/.claude/skills/prompt-tools/SKILL.md

OpenAI Codex:

mkdir -p ~/.codex/skills/prompt-tools
curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/skill/SKILL.md \
  -o ~/.codex/skills/prompt-tools/SKILL.md

Cursor:

mkdir -p ~/.cursor/skills/prompt-tools
curl -fsSL https://raw.githubusercontent.com/Cloverhound/prompt-tools-cli/main/skill/SKILL.md \
  -o ~/.cursor/skills/prompt-tools/SKILL.md

Claude Cowork: Run prompt-tools post-install to generate the ZIP, then upload at: Claude Desktop → Cowork tab → Customize → Skills → + → Upload a skill.

For project-specific installation, place the skill file in your project directory instead of the user-level folder.

Development

See CLAUDE.md for project structure and conventions.

make build    # Build binary
make check    # Build + go vet
go test ./... # Run tests

License

MIT
