6 changes: 5 additions & 1 deletion ai_agents/.env.example
@@ -100,6 +100,10 @@ LITELLM_MODEL=gpt-4o-mini
# Deepgram ASR key
DEEPGRAM_API_KEY=

# Extension: xai_asr_python, xai_tts_python
# xAI Voice API key
XAI_API_KEY=

# Azure ASR
AZURE_ASR_API_KEY=
AZURE_ASR_REGION=
@@ -223,4 +227,4 @@ ALIYUN_ANALYTICDB_NAMESPACE_PASSWORD=

# Sarvam
SARVAM_ASR_KEY=
SARVAM_TTS_KEY=
SARVAM_TTS_KEY=
44 changes: 39 additions & 5 deletions ai_agents/agents/examples/voice-assistant/README.md
@@ -1,6 +1,6 @@
# Voice Assistant

A comprehensive voice assistant with real-time conversation capabilities using Agora RTC, Deepgram STT, OpenAI LLM, and ElevenLabs TTS.
A configurable voice assistant with real-time conversation capabilities using Agora RTC, interchangeable STT/TTS providers, and an OpenAI-compatible LLM.

## Features

@@ -13,14 +13,25 @@ A comprehensive voice assistant with real-time conversation capabilities using A
1. **Agora Account**: Get credentials from [Agora Console](https://console.agora.io/)
- `AGORA_APP_ID` - Your Agora App ID (required)

2. **Deepgram Account**: Get credentials from [Deepgram Console](https://console.deepgram.com/)
- `DEEPGRAM_API_KEY` - Your Deepgram API key (required)
2. **STT Provider**: choose the graph you want to run
- `DEEPGRAM_API_KEY` for the default `voice_assistant` graph or `voice_assistant_xai_tts`
- `XAI_API_KEY` for `voice_assistant_xai_asr` or `voice_assistant_xai_full`

3. **OpenAI Account**: Get credentials from [OpenAI Platform](https://platform.openai.com/)
- `OPENAI_API_KEY` - Your OpenAI API key (required)

4. **ElevenLabs Account**: Get credentials from [ElevenLabs](https://elevenlabs.io/)
4. **TTS Provider**: choose the graph you want to run
- `ELEVENLABS_TTS_KEY` for the default `voice_assistant` graph or `voice_assistant_xai_asr`
- `XAI_API_KEY` for `voice_assistant_xai_tts` or `voice_assistant_xai_full`

### Provider-specific keys

- **Deepgram Account**: Get credentials from [Deepgram Console](https://console.deepgram.com/)
- `DEEPGRAM_API_KEY` - Your Deepgram API key (required for Deepgram STT graphs)
- **ElevenLabs Account**: Get credentials from [ElevenLabs](https://elevenlabs.io/)
- `ELEVENLABS_TTS_KEY` - Your ElevenLabs API key (required for ElevenLabs TTS graphs)
- **xAI Account**: Get credentials from [xAI Console](https://console.x.ai/)
- `XAI_API_KEY` - Your xAI Voice API key (required for xAI STT/TTS graphs)
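
The key requirements above can be summarised per graph. The following is a minimal sketch, not part of the example app: the `missing_keys` helper and the two mapping constants are hypothetical names, and the mapping is derived only from the graph descriptions in this README.

```python
import os

# Keys every graph needs, per the Agora and OpenAI prerequisites.
COMMON_KEYS = ("AGORA_APP_ID", "OPENAI_API_KEY")

# Provider keys per graph (hypothetical table built from this README).
GRAPH_PROVIDER_KEYS = {
    "voice_assistant": ("DEEPGRAM_API_KEY", "ELEVENLABS_TTS_KEY"),
    "voice_assistant_xai_asr": ("XAI_API_KEY", "ELEVENLABS_TTS_KEY"),
    "voice_assistant_xai_tts": ("DEEPGRAM_API_KEY", "XAI_API_KEY"),
    "voice_assistant_xai_full": ("XAI_API_KEY",),
}


def missing_keys(graph, env=None):
    """Return the required variables that are unset or empty for a graph."""
    env = os.environ if env is None else env
    required = COMMON_KEYS + GRAPH_PROVIDER_KEYS[graph]
    return [key for key in required if not env.get(key)]
```

Calling `missing_keys("voice_assistant_xai_full")` before `task run` would list any variables still to be filled in `.env`, assuming they have been exported into the current environment.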

### Optional Environment Variables

@@ -51,6 +62,9 @@ OPENAI_PROXY_URL=your_proxy_url_here
# ElevenLabs (required for text-to-speech)
ELEVENLABS_TTS_KEY=your_elevenlabs_api_key_here

# xAI (required for xAI speech-to-text and/or text-to-speech graphs)
XAI_API_KEY=your_xai_api_key_here

# Optional
WEATHERAPI_API_KEY=your_weather_api_key_here
```
@@ -71,14 +85,33 @@ cd agents/examples/voice-assistant
task run
```

The voice assistant starts with all capabilities enabled.
The stack starts the TEN app, API server, frontend, and TMAN Designer.

### 4. Access the Application

- **Frontend**: http://localhost:3000
- **API Server**: http://localhost:8080
- **TMAN Designer**: http://localhost:49483

### 5. Choose a Graph

The frontend reads the `graph` URL query parameter and matches it against
`tenapp/property.json` `predefined_graphs[].name`.

Available graph names:

- `voice_assistant` - Deepgram STT + OpenAI-compatible LLM + ElevenLabs TTS
- `voice_assistant_xai_asr` - xAI STT + OpenAI-compatible LLM + ElevenLabs TTS
- `voice_assistant_xai_tts` - Deepgram STT + OpenAI-compatible LLM + xAI TTS
- `voice_assistant_xai_full` - xAI STT + OpenAI-compatible LLM + xAI TTS

Examples:

```text
http://localhost:3000/?graph=voice_assistant_xai_full
https://ten-demo.agora.io/?graph=voice_assistant_xai_full
```
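
The lookup described above can be sketched as follows. This is an illustration only, not the frontend's actual code: `graph_from_url` and `find_graph` are hypothetical helpers, and the sketch assumes `predefined_graphs` sits either at the top level of `property.json` or under a `ten` key.

```python
import json
from urllib.parse import parse_qs, urlparse


def graph_from_url(url):
    """Return the value of the `graph` query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get("graph")
    return values[0] if values else None


def find_graph(property_json_text, name):
    """Return the predefined graph whose `name` matches, or None."""
    data = json.loads(property_json_text)
    # Assumption: the list is at the top level or nested under "ten".
    graphs = data.get("ten", data).get("predefined_graphs", [])
    for graph in graphs:
        if graph.get("name") == name:
            return graph
    return None
```

A graph name that does not appear in `predefined_graphs` returns `None` here; what the real frontend does in that case is not stated in this README.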

## Configuration

The voice assistant is configured in `tenapp/property.json`:
@@ -189,6 +222,7 @@ docker run --rm -it --env-file .env -p 8080:8080 -p 3000:3000 voice-assistant-ap

- [Agora RTC Documentation](https://docs.agora.io/en/rtc/overview/product-overview)
- [Deepgram API Documentation](https://developers.deepgram.com/)
- [xAI API Documentation](https://docs.x.ai/)
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [ElevenLabs API Documentation](https://docs.elevenlabs.io/)
- [TEN Framework Documentation](https://doc.theten.ai)
@@ -66,6 +66,9 @@
{
"path": "../../../ten_packages/extension/tencent_asr_python"
},
{
"path": "../../../ten_packages/extension/xai_asr_python"
},
{
"path": "../../../ten_packages/extension/xfyun_asr_bigmodel_python"
},
@@ -141,6 +144,9 @@
{
"path": "../../../ten_packages/extension/tencent_tts_python"
},
{
"path": "../../../ten_packages/extension/xai_tts_python"
},
{
"path": "../../../ten_packages/extension/message_collector2"
},