Add Telnyx WebSocket STT support for Node agents #1618

@robbedgdev

Feature request

Add Telnyx Speech-to-Text support for LiveKit Agents JS, ideally with an ergonomic API such as STT.withTelnyx(...) or a dedicated @livekit/agents-plugin-telnyx STT plugin.

Motivation

agents-js currently has LLM.withTelnyx(...) in the OpenAI-compatible LLM plugin, but there does not appear to be an equivalent STT integration for Telnyx. Telnyx now provides a standalone real-time STT WebSocket endpoint, which is useful for voice agents that already use Telnyx as a provider and want low-latency streaming transcription in Node agents.

This should not be implemented as a plain OpenAI-compatible transcription wrapper: Telnyx's STT WebSocket protocol uses query-string configuration, Authorization: Bearer ..., binary audio frames, and transcript JSON messages such as { "transcript": "...", "is_final": true, "confidence": 0.98 }.
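To make the transcript message shape concrete, here is a minimal sketch of parsing one text frame. It assumes only the three fields shown in the example above (transcript, is_final, confidence); the helper name parseTelnyxTranscript and the choice to ignore every other message are hypothetical, not taken from Telnyx documentation.

```typescript
// Shape of a transcript message, based only on the example in this issue.
// Any other fields Telnyx may send are not modeled here.
interface TelnyxTranscript {
  transcript: string;
  is_final: boolean;
  confidence?: number;
}

// Hypothetical helper: parse one WebSocket text frame.
// Returns null for anything that is not a transcript message.
function parseTelnyxTranscript(raw: string): TelnyxTranscript | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== 'object' || data === null) return null;

  const msg = data as Record<string, unknown>;
  if (typeof msg.transcript !== 'string') return null;

  return {
    transcript: msg.transcript,
    // Assumption: a missing is_final is treated as an interim result.
    is_final: msg.is_final === true,
    confidence: typeof msg.confidence === 'number' ? msg.confidence : undefined,
  };
}
```

Keeping the parser separate from the socket handling would let the transcript event mapping be tested without a live connection.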

References

Suggested API shape

Either of these would be useful:

import { STT } from '@livekit/agents-plugin-openai';

const stt = STT.withTelnyx({
  apiKey: process.env.TELNYX_API_KEY,
  language: 'en-US',
  transcriptionEngine: 'Telnyx',
  interimResults: true,
});

or:

import * as telnyx from '@livekit/agents-plugin-telnyx';

const stt = new telnyx.STT({
  apiKey: process.env.TELNYX_API_KEY,
  language: 'en-US',
  transcriptionEngine: 'Telnyx',
  interimResults: true,
});

The exact API is flexible; the important part is first-class support for Telnyx's WebSocket STT transport.

Acceptance criteria

  • Supports streaming STT over wss://api.telnyx.com/v2/speech-to-text/transcription.
  • Reads apiKey from TELNYX_API_KEY when not explicitly provided.
  • Supports Telnyx WebSocket parameters such as transcription_engine, model, language, input_format, sample_rate, and interim_results where applicable.
  • Streams audio using the Telnyx wire protocol, i.e. binary audio frames rather than OpenAI realtime input_audio_buffer.append JSON/base64 frames.
  • Maps Telnyx transcript messages into LiveKit STT events:
    • START_OF_SPEECH
    • INTERIM_TRANSCRIPT
    • FINAL_TRANSCRIPT
    • END_OF_SPEECH
  • Handles graceful close/finalization according to Telnyx's engine-specific WebSocket behavior.
  • Includes tests for WebSocket URL construction, auth headers, audio frame forwarding, transcript event mapping, and close/error handling.
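The URL construction and auth header criteria could be sketched roughly as follows. The endpoint and query parameter names come from the list above; the option names, the helper names buildTelnyxUrl and buildAuthHeaders, and the serialization of booleans and numbers as plain strings are assumptions for illustration, not the confirmed Telnyx contract.

```typescript
// Hypothetical option names; the query parameter names they map to are
// the ones listed in the acceptance criteria.
interface TelnyxSTTOptions {
  apiKey?: string;
  transcriptionEngine?: string;
  model?: string;
  language?: string;
  inputFormat?: string;
  sampleRate?: number;
  interimResults?: boolean;
}

const TELNYX_STT_URL = 'wss://api.telnyx.com/v2/speech-to-text/transcription';

// Build the WebSocket URL, skipping options that were not provided.
function buildTelnyxUrl(opts: TelnyxSTTOptions): string {
  const params: Array<[string, string | number | boolean | undefined]> = [
    ['transcription_engine', opts.transcriptionEngine],
    ['model', opts.model],
    ['language', opts.language],
    ['input_format', opts.inputFormat],
    ['sample_rate', opts.sampleRate],
    ['interim_results', opts.interimResults],
  ];
  const query = params
    .filter(([, value]) => value !== undefined)
    // Assumption: booleans and numbers are sent as their string forms.
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join('&');
  return query ? `${TELNYX_STT_URL}?${query}` : TELNYX_STT_URL;
}

// Resolve the API key (explicit option first, then TELNYX_API_KEY from the
// supplied environment map) and build the Bearer auth header.
function buildAuthHeaders(
  opts: TelnyxSTTOptions,
  env: Record<string, string | undefined>,
): Record<string, string> {
  const apiKey = opts.apiKey ?? env.TELNYX_API_KEY;
  if (!apiKey) {
    throw new Error('Telnyx API key is required: pass apiKey or set TELNYX_API_KEY');
  }
  return { Authorization: `Bearer ${apiKey}` };
}
```

In a Node agent, env would typically be process.env; it is passed in explicitly here so the fallback can be tested without touching the real environment.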

Notes

The Python Telnyx STT plugin looks like a useful behavioral reference: it connects with Bearer auth, passes STT options as WebSocket query parameters, streams audio, and converts Telnyx { transcript, is_final, confidence } messages into LiveKit STT events. A JS implementation could reuse the existing stt.STT / stt.SpeechStream patterns from the Deepgram or AssemblyAI plugins.
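One possible mapping from transcript messages to the four event types could look like the sketch below. The event type strings are local stand-ins for the real LiveKit STT event enum, and the TranscriptEventMapper class and its utterance-boundary rule (start on the first non-empty transcript, end right after a final one) are assumptions for illustration; the Python plugin's actual behavior should be checked before copying this.

```typescript
// Local stand-ins for the LiveKit STT event types named in this issue.
type SpeechEventType =
  | 'START_OF_SPEECH'
  | 'INTERIM_TRANSCRIPT'
  | 'FINAL_TRANSCRIPT'
  | 'END_OF_SPEECH';

interface SpeechEvent {
  type: SpeechEventType;
  text?: string;
  confidence?: number;
}

interface TranscriptMessage {
  transcript: string;
  is_final: boolean;
  confidence?: number;
}

// Hypothetical mapper: tracks whether an utterance is in progress and
// returns the events that one transcript message should produce.
class TranscriptEventMapper {
  private speaking = false;

  push(msg: TranscriptMessage): SpeechEvent[] {
    const events: SpeechEvent[] = [];
    // Assumption: empty transcripts carry no speech and are ignored.
    if (msg.transcript.length === 0) return events;

    if (!this.speaking) {
      this.speaking = true;
      events.push({ type: 'START_OF_SPEECH' });
    }

    if (msg.is_final) {
      events.push({ type: 'FINAL_TRANSCRIPT', text: msg.transcript, confidence: msg.confidence });
      events.push({ type: 'END_OF_SPEECH' });
      this.speaking = false; // the next transcript starts a new utterance
    } else {
      events.push({ type: 'INTERIM_TRANSCRIPT', text: msg.transcript, confidence: msg.confidence });
    }
    return events;
  }
}
```

A real SpeechStream implementation would forward these events to its output queue; keeping the state machine as a pure class makes the transcript event mapping straightforward to unit test.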
