Feature Contribution: Add support for ElevenLabs TTS voices.#1073

Open
evetzyokozuna wants to merge 26 commits into agent0ai:main from evetzyokozuna:main

Conversation

@evetzyokozuna

Summary

This PR adds and stabilizes an ElevenLabs-based voice output path in Agent Zero, alongside existing browser/Kokoro speech behavior.
It updates backend API handling, frontend speech routing, and settings UX so ElevenLabs can be enabled as an optional provider without regressing default behavior.

Files covered

  1. python/api/el11_tts.py
  2. webui/components/chat/speech/speech-store.js
  3. webui/components/settings/agent/speech.html
  4. requirements.txt

Problem Statement

Voice output through Kokoro works well enough, but users who want a more human-like voice for their Agent Zero deployment have no built-in way to use custom ElevenLabs voices.

This PR implements an additional capability to use ElevenLabs voices.


What this PR changes

1) python/api/el11_tts.py — ElevenLabs proxy API endpoint

Purpose

Provide a server-side TTS proxy endpoint (/el11_tts) that:

  • accepts text input from the UI
  • resolves active voice profile configuration
  • calls ElevenLabs with server-side credentials
  • returns playable audio/mpeg data to the client

Behavior

  • Expects payload like:
    • text (required)
    • profile (optional, defaults to active profile)
  • Loads per-agent voice config from:
    • agents/<profile>/elevenlabs_voice.json
  • Uses environment key:
    • EL11_API_KEY
  • Returns:
    • audio stream bytes (MPEG) on success
    • structured JSON error payload on failure

Why this matters

  • Keeps API key off the browser
  • Enables profile-specific voice identity
  • Creates a clean TTS backend interface that can be reused for telephony paths later
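
For orientation, the sketch below shows what a proxy handler with this behavior could look like. It assumes a Flask-style route (one commit message mentions a Flask hybrid), the agents/<profile>/elevenlabs_voice.json layout described above, and the public ElevenLabs text-to-speech REST endpoint; the helper names, default values, and error shapes are illustrative, not the PR's actual code.

```python
# Minimal sketch of a server-side ElevenLabs TTS proxy (illustrative, not the PR's exact code).
import json
import os

import requests
from flask import Flask, Response, jsonify, request

app = Flask(__name__)  # hypothetical app object; the real project wires routes its own way


def load_voice_config(profile: str) -> dict:
    """Read agents/<profile>/elevenlabs_voice.json (path layout taken from this PR description)."""
    path = os.path.join("agents", profile, "elevenlabs_voice.json")
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


@app.post("/el11_tts")
def el11_tts():
    payload = request.get_json(silent=True) or {}
    text = payload.get("text")
    profile = payload.get("profile", "default")  # placeholder for "defaults to active profile"
    if not text:
        return jsonify({"error": "missing 'text'"}), 400

    api_key = os.environ.get("EL11_API_KEY")
    if not api_key:
        return jsonify({"error": "EL11_API_KEY not configured"}), 500

    cfg = load_voice_config(profile)
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{cfg['voice_id']}",
        headers={"xi-api-key": api_key, "Accept": "audio/mpeg"},
        json={
            "text": text,
            "model_id": cfg.get("model", "eleven_multilingual_v2"),
            "voice_settings": {
                "stability": cfg.get("stability", 0.5),
                "similarity_boost": cfg.get("similarity_boost", 0.75),
                "style": cfg.get("style", 0.0),
            },
        },
        timeout=60,
    )
    if resp.status_code != 200:
        return jsonify({"error": "elevenlabs request failed", "detail": resp.text}), 502
    return Response(resp.content, mimetype="audio/mpeg")
```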

2) webui/components/chat/speech/speech-store.js — speech provider routing + playback

Purpose

Add real speech routing support for ElevenLabs in the existing TTS flow.

Behavior added

  • provider gating checks for ElevenLabs mode (via local settings/toggle)
  • new ElevenLabs speech path that calls /el11_tts
  • robust audio playback for returned audio blobs
  • fallback behavior retained:
    • if ElevenLabs fails, existing Kokoro/browser behavior still works
  • existing stream/chunk speech flow remains intact

Why this matters

  • The UI can now actually use ElevenLabs audio, not just display a toggle
  • Preserves backward compatibility for users not enabling ElevenLabs

3) webui/components/settings/agent/speech.html — settings UX

Purpose

Expose a clear user-facing toggle for ElevenLabs proxy TTS in the Speech settings panel.

Behavior added

  • an explicit “Enable ElevenLabs TTS Proxy” control
  • UX text clarifying this uses the server proxy route and requires configured key/config

Why this matters

  • Makes the behavior discoverable and controllable from the UI
  • Aligns user intent with actual provider routing in speech-store

4) requirements.txt — dependency/runtime parity

Purpose

Align dependency set with runtime expectations for the ElevenLabs integration path and live environment stability.

Why this matters

  • Reduces “works in one environment but not another” drift
  • Supports reproducible deployments and clean runtime behavior

Configuration and Usage

Required env

  • EL11_API_KEY=<your_elevenlabs_key>

Required voice config

Place elevenlabs_voice.json in relevant agent directories, e.g.:

  • agents/agent0/elevenlabs_voice.json
  • agents/default/elevenlabs_voice.json
  • etc.

Example fields:

  • voice_id
  • model
  • stability
  • similarity_boost
  • style
  • optional quality-related settings as supported by endpoint
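
To make these fields concrete, a hypothetical agents/agent0/elevenlabs_voice.json could look like the file below; the voice_id is a placeholder and the numeric values are illustrative, not values shipped by this PR.

```json
{
  "voice_id": "<your_elevenlabs_voice_id>",
  "model": "eleven_multilingual_v2",
  "stability": 0.5,
  "similarity_boost": 0.75,
  "style": 0.0
}
```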

Enable in UI

  1. Open Settings -> Agent -> Speech
  2. Enable ElevenLabs TTS Proxy
  3. Trigger any voice output path in chat

Backward Compatibility

  • Default speech behavior remains unchanged unless ElevenLabs mode is enabled.
  • Kokoro/browser fallback paths remain available.
  • Existing speech chunking and stream sequencing logic is preserved.

Security Considerations

  • ElevenLabs API key remains server-side (not exposed to browser code).
  • Frontend calls local authenticated endpoint (/el11_tts) rather than external API directly.
  • Profile-based config loading is constrained to expected agent config files.
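
One way to implement the last point is to resolve the profile name against a fixed agents root and reject anything that escapes it. The sketch below is a hedged illustration of that idea; the function name and directory constant are invented, not the PR's actual code.

```python
import os

AGENTS_ROOT = os.path.abspath("agents")  # assumed repo-relative root for agent profiles


def resolve_voice_config_path(profile: str) -> str:
    """Map a profile name to its elevenlabs_voice.json, rejecting path traversal."""
    candidate = os.path.abspath(os.path.join(AGENTS_ROOT, profile, "elevenlabs_voice.json"))
    if not candidate.startswith(AGENTS_ROOT + os.sep):
        raise ValueError("profile resolves outside the agents directory")
    return candidate
```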

Validation / Test Notes

Manual checks performed

  • endpoint registration and availability for /el11_tts
  • valid audio response path (content-type: audio/mpeg)
  • frontend served assets include ElevenLabs routing logic
  • settings toggle rendered and persisted in UI
  • fallback behavior sanity checked
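
A quick way to reproduce the audio-response check is a local smoke test like the one below; the base URL, profile name, and lack of explicit auth handling are assumptions to adapt to the local deployment (the endpoint is described as authenticated, so session credentials may be required).

```python
import requests

BASE_URL = "http://localhost:50001"  # placeholder; use your Agent Zero web UI address

resp = requests.post(
    f"{BASE_URL}/el11_tts",
    json={"text": "Hello from the ElevenLabs proxy.", "profile": "agent0"},
    timeout=60,
)
assert resp.status_code == 200, resp.text
assert resp.headers.get("Content-Type", "").startswith("audio/mpeg")
with open("el11_test.mp3", "wb") as f:
    f.write(resp.content)
print("wrote", len(resp.content), "bytes of audio")
```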

Suggested reviewer checks

  • verify speech quality changes when ElevenLabs toggle is enabled
  • verify fallback when ElevenLabs key/config is missing
  • verify no regressions in browser/Kokoro modes
  • verify multi-agent profile voice switching behavior

Known Limitations / Follow-ups

  • provider mode is currently toggle-based; a future refinement could consolidate it into a single tts_mode setting for clarity.
  • telemetry around provider selection/fallback reason could be added for troubleshooting.
  • future telephony integration may reuse /el11_tts shape or move to provider abstraction layer.

Why this PR is valuable

This change turns ElevenLabs support from “partial wiring + config files” into a working, testable, user-selectable voice path in Agent Zero.
It is designed to preserve current behavior while enabling higher-quality voice output now and cleaner voice-provider extensibility going forward.
