RTP Opus Streamer

Real-time audio streaming system demonstrating production-grade network protocols, audio processing, and observability. Built with Rust for performance and safety.

Overview

This project implements a complete RTP/Opus streaming pipeline with:

Network Resilience: Jitter buffering, packet reordering, loss concealment
Audio Analysis: Real-time FFT-based spectral analysis (Phase 4)
Observability: Prometheus metrics, structured logging, performance profiling
Production Quality: Comprehensive testing, CI/CD, RFC compliance

Current Configuration: Voice-optimized (16kHz, 24 kbps). Music content will sound degraded. See samples/README.md for details.

Target Use Cases: VoIP systems, live streaming, real-time communication platforms

Documentation:

📋 Design Document - Architecture, performance analysis, design decisions
📊 Performance Report - Real-world metrics and validation

Project Status: Phase 4 complete (Audio Analysis & ML Integration)
Development Roadmap: See Project Plan

Architecture

┌─────────────────────────────────────────┐
│         Audio Source                    │
│       (WAV file / device)               │
└──────────────┬──────────────────────────┘
               │ 20ms PCM frames
               ↓
┌──────────────┴──────────────────────────┐
│         Opus Encoder                    │
│    (24 kbps, voice-optimized)           │
└──────────────┬──────────────────────────┘
               │ Compressed frames
               ↓
┌──────────────┴──────────────────────────┐
│         RTP Packetizer                  │
│       (RFC 3550, seq#, ts)              │
└──────────────┬──────────────────────────┘
               │ RTP packets
               ↓
         [ UDP Socket ]
               │
               ↓
┌──────────────┴──────────────────────────┐
│         RTP Receiver                    │
│    (validate, extract payload)          │
└──────────────┬──────────────────────────┘
               │ Opus frames
               ↓
┌──────────────┴──────────────────────────┐
│         Jitter Buffer                   │
│   (reorder, loss detect, delay)         │
└──────────────┬──────────────────────────┘
               │ Ordered frames
               ↓
┌──────────────┴──────────────────────────┐
│         Opus Decoder                    │
│         (to PCM)                        │
└──────────────┬──────────────────────────┘
               │ PCM samples
               ↓
┌──────────────┴──────────────────────────┐
│         Audio Sink                      │
│       (playback device)                 │
└─────────────────────────────────────────┘
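
As an illustration of the RTP Packetizer and RTP Receiver stages, here is a minimal sketch of the fixed 12-byte RFC 3550 header. It is not the project's code: `RtpHeader` and its methods are hypothetical names, and the sketch rejects packets with CSRC lists, header extensions, or padding rather than handling them. The per-frame tick count is left as a parameter. RFC 7587 specifies a 48 kHz RTP clock for Opus (960 ticks per 20 ms frame), but whether this project follows that is an assumption.

```rust
/// Fixed RTP header fields (RFC 3550, section 5.1), without CSRCs or extensions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RtpHeader {
    pub marker: bool,
    pub payload_type: u8, // 7 bits; dynamic payload types are 96..=127
    pub sequence: u16,
    pub timestamp: u32,
    pub ssrc: u32,
}

pub const RTP_HEADER_LEN: usize = 12;
pub const RTP_VERSION: u8 = 2;

impl RtpHeader {
    /// Serialize to the 12-byte network (big-endian) layout.
    pub fn to_bytes(&self) -> [u8; RTP_HEADER_LEN] {
        let mut b = [0u8; RTP_HEADER_LEN];
        b[0] = RTP_VERSION << 6; // V=2, P=0, X=0, CC=0
        b[1] = ((self.marker as u8) << 7) | (self.payload_type & 0x7f);
        b[2..4].copy_from_slice(&self.sequence.to_be_bytes());
        b[4..8].copy_from_slice(&self.timestamp.to_be_bytes());
        b[8..12].copy_from_slice(&self.ssrc.to_be_bytes());
        b
    }

    /// Validate a packet and split it into header and payload.
    /// Returns None for short packets, a wrong version, or any use of
    /// padding, extensions, or CSRCs (not handled in this sketch).
    pub fn parse(packet: &[u8]) -> Option<(RtpHeader, &[u8])> {
        if packet.len() < RTP_HEADER_LEN || packet[0] != RTP_VERSION << 6 {
            return None;
        }
        let header = RtpHeader {
            marker: packet[1] & 0x80 != 0,
            payload_type: packet[1] & 0x7f,
            sequence: u16::from_be_bytes([packet[2], packet[3]]),
            timestamp: u32::from_be_bytes([packet[4], packet[5], packet[6], packet[7]]),
            ssrc: u32::from_be_bytes([packet[8], packet[9], packet[10], packet[11]]),
        };
        Some((header, &packet[RTP_HEADER_LEN..]))
    }

    /// Header for the following frame: sequence +1 and timestamp +ticks,
    /// both wrapping as RFC 3550 requires.
    pub fn next(&self, ticks: u32) -> RtpHeader {
        RtpHeader {
            sequence: self.sequence.wrapping_add(1),
            timestamp: self.timestamp.wrapping_add(ticks),
            ..*self
        }
    }
}
```

A sender would prepend `header.to_bytes()` to each encoded Opus frame and call `header.next(ticks)` once per frame; the receiver calls `RtpHeader::parse` on each datagram and hands the payload to the jitter buffer.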

Implementation Phases

Phase 1: Core Pipeline (Week 1) - File → RTP → Playback ✅
- Audio file reader, Opus encode/decode, RTP packetization, UDP transport, playback
Phase 2: Network Resilience (Week 2) - Robust packet handling ✅
- Jitter buffer (60ms configurable), packet reordering, loss detection, statistics tracking, PLC
Phase 3: Observability (Week 3) - Metrics and measurement ✅
- Prometheus-based metrics, latency measurement, and system observability
Phase 4: Audio Analysis & ML Integration (Week 4+) - Audio intelligence ✅
- Real-time FFT spectral analysis, dominant frequency extraction, spectral energy/centroid (use --analyze flag)
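
To make the Phase 4 features concrete, the sketch below computes dominant frequency, spectral energy, and spectral centroid for one frame. It deliberately uses a direct O(n^2) DFT instead of an FFT so that it needs no external crate; the results are the same, only slower. The names `SpectralFeatures`, `magnitude_spectrum`, and `analyze` are hypothetical, and the exact definitions the project uses (windowing, normalization) are not stated in this README.

```rust
use std::f64::consts::PI;

#[derive(Debug, Clone, Copy)]
pub struct SpectralFeatures {
    pub dominant_hz: f64, // frequency of the strongest bin
    pub energy: f64,      // sum of squared bin magnitudes
    pub centroid_hz: f64, // magnitude-weighted mean frequency
}

/// Magnitudes of bins 0..=n/2 computed with a direct DFT (no windowing).
pub fn magnitude_spectrum(samples: &[f64]) -> Vec<f64> {
    let n = samples.len();
    (0..=n / 2)
        .map(|k| {
            let (mut re, mut im) = (0.0f64, 0.0f64);
            for (i, &x) in samples.iter().enumerate() {
                let phase = -2.0 * PI * (k as f64) * (i as f64) / (n as f64);
                re += x * phase.cos();
                im += x * phase.sin();
            }
            (re * re + im * im).sqrt()
        })
        .collect()
}

/// Extract the three features from one frame of PCM samples.
pub fn analyze(samples: &[f64], sample_rate_hz: f64) -> SpectralFeatures {
    if samples.is_empty() {
        return SpectralFeatures { dominant_hz: 0.0, energy: 0.0, centroid_hz: 0.0 };
    }
    let mags = magnitude_spectrum(samples);
    let bin_hz = sample_rate_hz / samples.len() as f64; // width of one bin
    let mut peak = 0;
    let (mut energy, mut weighted, mut total) = (0.0, 0.0, 0.0);
    for (k, &m) in mags.iter().enumerate() {
        if m > mags[peak] {
            peak = k;
        }
        energy += m * m;
        weighted += k as f64 * bin_hz * m;
        total += m;
    }
    let centroid_hz = if total > 0.0 { weighted / total } else { 0.0 };
    SpectralFeatures { dominant_hz: peak as f64 * bin_hz, energy, centroid_hz }
}
```

With a 20ms frame at 16kHz (320 samples), each bin is 50 Hz wide, so a tone is reported to the nearest 50 Hz unless the analysis interpolates between bins.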

Building

Prerequisites:

Linux (Ubuntu/Debian):

sudo apt-get install libopus-dev libasound2-dev

Linux (Fedora/RHEL):

sudo dnf install opus-devel alsa-lib-devel

macOS:

brew install opus

Windows:

Install Opus via vcpkg or download pre-built binaries
WASAPI used for audio (no additional dependencies)

Build:

cargo build --release

Running

Basic Usage

Terminal 1 - Start Receiver:

./target/release/receiver --port 5004

# With custom jitter buffer depth (default: 60ms)
./target/release/receiver --port 5004 --buffer-depth-ms 100

# With real-time audio analysis (Phase 4)
./target/release/receiver --port 5004 --analyze

Terminal 2 - Send Audio:

./target/release/sender --input samples/voice.wav --remote 127.0.0.1:5004

Testing with Generated Audio

You can create a test WAV file using various tools:

# Using sox (if installed)
sox -n -r 16000 -c 1 test.wav synth 5 sine 440

# Using ffmpeg (if installed)
ffmpeg -f lavfi -i "sine=frequency=440:duration=5:sample_rate=16000" -ac 1 test.wav
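
If neither tool is available, an equivalent file can be produced with a few lines of Rust. The sketch below builds a 16-bit mono PCM WAV in memory using only the standard library; `sine_wav` is a hypothetical helper, not part of this repository, and the half-scale amplitude is an arbitrary choice.

```rust
use std::f64::consts::PI;

/// Build a 16-bit mono PCM WAV file containing a sine tone.
pub fn sine_wav(freq_hz: f64, seconds: f64, sample_rate: u32) -> Vec<u8> {
    let n = (seconds * sample_rate as f64).round() as u32; // sample count
    let data_len = n * 2; // 2 bytes per 16-bit mono sample
    let mut out = Vec::with_capacity(44 + data_len as usize);

    // RIFF container header
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");

    // "fmt " chunk describing uncompressed PCM
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // chunk size
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    out.extend_from_slice(&1u16.to_le_bytes()); // channels
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&(sample_rate * 2).to_le_bytes()); // bytes per second
    out.extend_from_slice(&2u16.to_le_bytes()); // block align
    out.extend_from_slice(&16u16.to_le_bytes()); // bits per sample

    // "data" chunk with the samples at half of full scale
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for i in 0..n {
        let t = i as f64 / sample_rate as f64;
        let s = (2.0 * PI * freq_hz * t).sin() * 0.5 * i16::MAX as f64;
        out.extend_from_slice(&(s as i16).to_le_bytes());
    }
    out
}
```

Writing the result with `std::fs::write("test.wav", sine_wav(440.0, 5.0, 16_000))` should give a file comparable to the sox command above (5 seconds, 440 Hz, 16kHz mono).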

Command Line Options

Sender:

sender --input <file.wav> --remote <ip:port> [--interval-ms <ms>]

--input: Path to WAV file (any sample rate, mono or stereo). Currently optimized for voice (see samples/README.md for details).
--remote: Destination IP:port (default: 127.0.0.1:5004)
--interval-ms: Packet send interval in ms (default: 20ms for real-time)

Receiver:

receiver --port <port> [--buffer-depth-ms <ms>] [--analyze]

--port: UDP port to listen on (default: 5004)
--buffer-depth-ms: Jitter buffer depth in milliseconds (default: 60ms)
--analyze: Enable real-time spectral analysis output (Phase 4)

Example: Local Loopback Test

# Terminal 1
cargo run --bin receiver --release

# Terminal 2
cargo run --bin sender --release -- --input samples/voice.wav

# With audio analysis
cargo run --bin receiver --release -- --analyze

Testing

# Unit tests
cargo test

# Integration tests (requires audio fixtures)
cargo test --test integration

# Benchmarks
cargo bench

Key Design Choices (Summary)

Frame Size: 20ms
Opus supports 2.5, 5, 10, 20, 40, and 60ms frames. Using 20ms balances:

Latency: Lower frame size reduces algorithmic delay
Efficiency: Higher frame size improves compression
Network: 20ms = 50 packets/sec, manageable overhead
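
The trade-off can be checked with some arithmetic. The sketch below is illustrative only: `frame_budget` is a hypothetical helper, the 24 kbps figure is treated as an exact average (a variable-bitrate encoder will fluctuate around it), and the 40-byte overhead assumes IPv4 + UDP + RTP headers with no extensions.

```rust
/// Per-packet header overhead in bytes: IPv4 (20) + UDP (8) + RTP (12).
pub const HEADER_OVERHEAD_BYTES: f64 = 40.0;

#[derive(Debug, Clone, Copy)]
pub struct FrameBudget {
    pub samples_per_frame: f64, // PCM samples fed to the encoder per frame
    pub packets_per_sec: f64,   // one packet per frame
    pub payload_bytes: f64,     // average encoded bytes per frame
    pub wire_kbps: f64,         // payload plus headers, in kbit/s
}

/// Derive packet rate and on-the-wire bitrate for a given frame duration.
pub fn frame_budget(sample_rate_hz: f64, frame_ms: f64, bitrate_bps: f64) -> FrameBudget {
    let packets_per_sec = 1000.0 / frame_ms;
    let payload_bytes = bitrate_bps * frame_ms / 1000.0 / 8.0;
    FrameBudget {
        samples_per_frame: sample_rate_hz * frame_ms / 1000.0,
        packets_per_sec,
        payload_bytes,
        wire_kbps: (payload_bytes + HEADER_OVERHEAD_BYTES) * 8.0 * packets_per_sec / 1000.0,
    }
}
```

Under these assumptions, `frame_budget(16_000.0, 20.0, 24_000.0)` gives 320 samples per frame, 50 packets/sec, 60 payload bytes, and 40 kbps on the wire. Halving the frame to 10ms raises the wire rate to 56 kbps, while 60ms frames lower it to about 29.3 kbps at the cost of 40ms more algorithmic delay.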

Jitter Buffer: 60ms
Typical networks show 10-30ms jitter. A 60ms buffer provides:

Headroom for variance (2-3σ coverage)
Acceptable added latency
Reordering window for out-of-sequence packets
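
The following sketch shows one way a fixed-depth jitter buffer can reorder packets and flag losses. It is a simplified illustration under stated assumptions, not the receiver's actual implementation: `JitterBuffer` and `Playout` are hypothetical names, depth is counted in whole frames (60ms / 20ms = 3), and `pop` is assumed to be called once per frame interval by the playback loop.

```rust
use std::collections::HashMap;

/// Result of asking the buffer for the next frame to play.
#[derive(Debug, PartialEq, Eq)]
pub enum Playout {
    Frame(Vec<u8>), // payload for the expected sequence number
    Lost,           // expected packet never arrived; conceal it
    NotReady,       // still filling the initial delay
}

pub struct JitterBuffer {
    depth_frames: usize,
    next_seq: Option<u16>, // next sequence number due for playout
    started: bool,
    slots: HashMap<u16, Vec<u8>>,
}

impl JitterBuffer {
    pub fn new(depth_ms: u32, frame_ms: u32) -> Self {
        JitterBuffer {
            depth_frames: (depth_ms / frame_ms).max(1) as usize,
            next_seq: None,
            started: false,
            slots: HashMap::new(),
        }
    }

    /// Store a packet. Returns false if it arrived after its playout slot.
    pub fn push(&mut self, seq: u16, payload: Vec<u8>) -> bool {
        match self.next_seq {
            None => self.next_seq = Some(seq),
            Some(next) => {
                // Signed distance handles wrap-around from 65535 to 0.
                if (seq.wrapping_sub(next) as i16) < 0 {
                    if self.started {
                        return false; // too late to play
                    }
                    self.next_seq = Some(seq); // earlier packet before playout began
                }
            }
        }
        self.slots.insert(seq, payload);
        true
    }

    /// Take the next frame in sequence order, once the initial delay is filled.
    pub fn pop(&mut self) -> Playout {
        let next = match self.next_seq {
            Some(n) => n,
            None => return Playout::NotReady,
        };
        if !self.started {
            if self.slots.len() < self.depth_frames {
                return Playout::NotReady;
            }
            self.started = true;
        }
        self.next_seq = Some(next.wrapping_add(1));
        match self.slots.remove(&next) {
            Some(payload) => Playout::Frame(payload),
            None => Playout::Lost,
        }
    }
}
```

A `Lost` result is the point where a receiver would typically ask the decoder for packet loss concealment instead of decoding a payload. This sketch does not adapt its depth or detect the end of a stream; once started, an empty buffer keeps reporting `Lost`.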

Codec Configuration: Voice-Optimized
Current settings (16kHz, 24 kbps, VOIP mode) prioritize bandwidth efficiency for speech. Music content will sound degraded. Future work will add configurable codec modes.

See docs/design.md for full analysis.

Performance Targets

Metric	Target
Glass-to-glass	< 150ms (p50)
CPU per stream	< 2%
Packet loss @ 5%	Imperceptible
Max concurrent	50+ streams
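
The glass-to-glass target is stated as a p50, meaning half of the measured latencies should fall below 150ms. As a hedged illustration of how such a check could be written, the sketch below uses the nearest-rank percentile method; `percentile` and `meets_latency_target` are hypothetical helpers, and how the project actually collects or aggregates latency samples is not described here.

```rust
/// Nearest-rank percentile of latency samples in milliseconds.
/// Returns None for an empty slice or a percentile above 100.
pub fn percentile(samples_ms: &[u32], p: usize) -> Option<u32> {
    if samples_ms.is_empty() || p > 100 {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_unstable();
    // rank = ceil(p/100 * n), clamped to at least 1, then converted to an index
    let rank = ((p * sorted.len() + 99) / 100).max(1);
    Some(sorted[rank - 1])
}

/// True when the median latency is under the 150ms target from the table above.
pub fn meets_latency_target(samples_ms: &[u32]) -> bool {
    matches!(percentile(samples_ms, 50), Some(p50) if p50 < 150)
}
```

A p50 target says nothing about the tail, so the same helper with `p = 95` or `p = 99` is a useful companion check when validating against real traffic.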

Extending

This is a reference implementation. Production deployments should consider:

SRTP for encryption
DTLS key exchange
ICE/STUN/TURN for NAT traversal
Scalability (multicast, forwarding servers)

References

RFC 3550: RTP (Real-time Transport Protocol)
RFC 6716: Opus Audio Codec
RFC 3551: RTP Profile for Audio/Video

License

MIT OR Apache-2.0
