Feature request
Add Telnyx Speech-to-Text support for LiveKit Agents JS, ideally with an ergonomic API such as STT.withTelnyx(...) or a dedicated @livekit/agents-plugin-telnyx STT plugin.
Motivation
agents-js currently has LLM.withTelnyx(...) in the OpenAI-compatible LLM plugin, but there does not appear to be an equivalent STT integration for Telnyx. Telnyx now provides a standalone real-time STT WebSocket endpoint, which is useful for voice agents that already use Telnyx as a provider and want low-latency streaming transcription in Node agents.
This should not be implemented as a plain OpenAI-compatible transcription wrapper: Telnyx's STT WebSocket protocol uses query-string configuration, Authorization: Bearer ..., binary audio frames, and transcript JSON messages such as { "transcript": "...", "is_final": true, "confidence": 0.98 }.
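To make the message shape concrete, here is a minimal sketch of parsing the transcript JSON quoted above. The field names come from the example message in this issue; the `TelnyxTranscript` interface and `parseTelnyxTranscript` helper are hypothetical names, and treating every other payload as "not a transcript" is an assumption, not documented Telnyx behavior.

```typescript
// Shape of a Telnyx transcript message, as quoted in this issue.
// `confidence` is treated as optional here (an assumption).
interface TelnyxTranscript {
  transcript: string;
  is_final: boolean;
  confidence?: number;
}

// Hypothetical helper: returns null for anything that is not valid JSON
// or does not carry a string `transcript` field.
function parseTelnyxTranscript(raw: string): TelnyxTranscript | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) return null;

  const msg = data as Record<string, unknown>;
  if (typeof msg.transcript !== 'string') return null;

  return {
    transcript: msg.transcript,
    // Anything other than a literal `true` is treated as a non-final result.
    is_final: msg.is_final === true,
    confidence: typeof msg.confidence === 'number' ? msg.confidence : undefined,
  };
}
```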
References
- plugins/deepgram/src/stt.ts already implements a WebSocket streaming STT plugin in this repo.
Suggested API shape
Either of these would be useful:
import { STT } from '@livekit/agents-plugin-openai';

const stt = STT.withTelnyx({
  apiKey: process.env.TELNYX_API_KEY,
  language: 'en-US',
  transcriptionEngine: 'Telnyx',
  interimResults: true,
});
or:
import * as telnyx from '@livekit/agents-plugin-telnyx';

const stt = new telnyx.STT({
  apiKey: process.env.TELNYX_API_KEY,
  language: 'en-US',
  transcriptionEngine: 'Telnyx',
  interimResults: true,
});
The exact API is flexible; the important part is first-class support for Telnyx's WebSocket STT transport.
Acceptance criteria
- Supports streaming STT over wss://api.telnyx.com/v2/speech-to-text/transcription.
- Reads apiKey from TELNYX_API_KEY when not explicitly provided.
- Supports Telnyx WebSocket parameters such as transcription_engine, model, language, input_format, sample_rate, and interim_results where applicable.
- Streams audio using the Telnyx wire protocol, i.e. binary audio frames rather than OpenAI realtime input_audio_buffer.append JSON/base64 frames.
- Maps Telnyx transcript messages into LiveKit STT events:
  - START_OF_SPEECH
  - INTERIM_TRANSCRIPT
  - FINAL_TRANSCRIPT
  - END_OF_SPEECH
- Handles graceful close/finalization according to Telnyx's engine-specific WebSocket behavior.
- Includes tests for WebSocket URL construction, auth headers, audio frame forwarding, transcript event mapping, and close/error handling.
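As a rough illustration of the URL, API key, and auth header criteria above, the connection setup could look something like the sketch below. The endpoint and query parameter names are the ones listed in this issue; the camelCase option names and the helpers `buildTelnyxSttUrl`, `resolveApiKey`, and `authHeaders` are hypothetical, and which parameters each engine accepts is not verified here.

```typescript
// Hypothetical option names, mirroring the camelCase style of the suggested API.
interface TelnyxSttOptions {
  transcriptionEngine?: string;
  model?: string;
  language?: string;
  inputFormat?: string;
  sampleRate?: number;
  interimResults?: boolean;
}

// Endpoint as given in the acceptance criteria.
const TELNYX_STT_URL = 'wss://api.telnyx.com/v2/speech-to-text/transcription';

// Builds the WebSocket URL, passing only the options that were actually set.
function buildTelnyxSttUrl(opts: TelnyxSttOptions): string {
  const params: Array<[string, string | number | boolean | undefined]> = [
    ['transcription_engine', opts.transcriptionEngine],
    ['model', opts.model],
    ['language', opts.language],
    ['input_format', opts.inputFormat],
    ['sample_rate', opts.sampleRate],
    ['interim_results', opts.interimResults],
  ];
  const query = params
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${key}=${encodeURIComponent(String(value))}`)
    .join('&');
  return query.length > 0 ? `${TELNYX_STT_URL}?${query}` : TELNYX_STT_URL;
}

// Prefers an explicit key, then falls back to TELNYX_API_KEY from the given environment.
function resolveApiKey(
  explicit: string | undefined,
  env: Record<string, string | undefined>,
): string {
  const key = explicit ?? env.TELNYX_API_KEY;
  if (!key) {
    throw new Error('Telnyx API key is required: pass apiKey or set TELNYX_API_KEY');
  }
  return key;
}

// Bearer auth header sent with the WebSocket handshake.
function authHeaders(apiKey: string): Record<string, string> {
  return { Authorization: `Bearer ${apiKey}` };
}
```

In a Node agent, `resolveApiKey(opts.apiKey, process.env)` would supply the key. Keeping these as pure functions also makes the URL construction and auth header tests from the last criterion straightforward to write without opening a socket.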
Notes
The Python Telnyx STT plugin looks like a useful behavioral reference: it connects with Bearer auth, passes STT options as WebSocket query parameters, streams audio, and converts Telnyx { transcript, is_final, confidence } messages into LiveKit STT events. A JS implementation could reuse the existing stt.STT / stt.SpeechStream patterns from the Deepgram or AssemblyAI plugins.
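One possible way to turn transcript messages into the four event types is a small state machine, sketched below. This is an assumption about a reasonable mapping, not a description of what the Python plugin does: the event names are local string stand-ins rather than the `@livekit/agents` enum, `TranscriptEventMapper` is a hypothetical class, and ending the speech segment on every final transcript is a simplification.

```typescript
// Local stand-ins for the LiveKit STT event types named in the acceptance criteria.
type SttEventType =
  | 'START_OF_SPEECH'
  | 'INTERIM_TRANSCRIPT'
  | 'FINAL_TRANSCRIPT'
  | 'END_OF_SPEECH';

interface SttEvent {
  type: SttEventType;
  text?: string;
  confidence?: number;
}

// Telnyx transcript message shape, as quoted in this issue.
interface TelnyxTranscript {
  transcript: string;
  is_final: boolean;
  confidence?: number;
}

// Hypothetical mapper: tracks whether a speech segment is open and
// returns the events that one incoming message should produce.
class TranscriptEventMapper {
  private speaking = false;

  push(msg: TelnyxTranscript): SttEvent[] {
    const events: SttEvent[] = [];

    // Assumption: empty transcripts carry no speech and are ignored.
    if (msg.transcript.length === 0) return events;

    // The first non-empty transcript opens a speech segment.
    if (!this.speaking) {
      this.speaking = true;
      events.push({ type: 'START_OF_SPEECH' });
    }

    if (msg.is_final) {
      // A final transcript closes the segment (a simplification).
      events.push({
        type: 'FINAL_TRANSCRIPT',
        text: msg.transcript,
        confidence: msg.confidence,
      });
      events.push({ type: 'END_OF_SPEECH' });
      this.speaking = false;
    } else {
      events.push({
        type: 'INTERIM_TRANSCRIPT',
        text: msg.transcript,
        confidence: msg.confidence,
      });
    }
    return events;
  }
}
```

A real plugin would emit these through its speech stream instead of returning arrays, but keeping the mapping free of I/O makes the transcript event mapping easy to unit test.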