WebUI media display enhancements: audio + video#1067
Open
nickdwhite wants to merge 2 commits into agent0ai:main from
Summary
This PR introduces inline audio and video playback capabilities to Agent Zero's WebUI chat interface, enabling users to play media files directly within chat messages without downloading or opening external applications.
Problem Statement
Currently, Agent Zero supports file attachments in chat messages, but audio and video files are displayed as plain text links or generic file icons. To play them, users must download the file or open it in an external application.
This creates friction in workflows where media content (voice messages, video explanations, audio recordings) is frequently shared between users and agents.
Solution
Implements native HTML5 <audio> and <video> players that render directly in chat messages when media content is detected. The solution includes:
Features
Protocol Support
The implementation recognizes custom URL schemes:
audio://path/to/file.mp3 - Renders audio player
video://path/to/file.mp4 - Renders video player
file:// and http(s):// URLs with media extensions are also detected
Implementation Details
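The detection rules under Protocol Support, together with the audio:// -> /api/media/get?path= conversion listed under Frontend Changes, can be sketched roughly as follows. This is an illustration in Python, not the PR's code (the actual logic lives in webui/js/messages.js). The function names are hypothetical, the extension sets are taken from the For Users list, and mapping video:// to the same endpoint as audio:// is an assumption.

```python
import posixpath
from urllib.parse import quote, urlsplit

# Extension lists as given in the "For Users" section of this PR.
AUDIO_EXTS = {".mp3", ".wav", ".ogg", ".aac", ".flac", ".m4a"}
VIDEO_EXTS = {".mp4", ".webm", ".ogv", ".mov"}

# Endpoint prefix as given under "Frontend Changes".
MEDIA_ENDPOINT = "/api/media/get?path="


def classify_media(url):
    """Return "audio", "video", or None for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Custom schemes state the media kind directly.
    if scheme in ("audio", "video"):
        return scheme
    # file:// and http(s):// URLs are classified by file extension.
    if scheme in ("file", "http", "https"):
        ext = posixpath.splitext(parts.path.lower())[1]
        if ext in AUDIO_EXTS:
            return "audio"
        if ext in VIDEO_EXTS:
            return "video"
    return None


def to_endpoint(url):
    """Map audio://... (and, by assumption, video://...) to the media endpoint.

    Returns None for any other scheme; how those are served is not
    specified in the PR description.
    """
    scheme, sep, path = url.partition("://")
    if sep and scheme.lower() in ("audio", "video"):
        # Percent-encode the path so it is safe inside a query string.
        return MEDIA_ENDPOINT + quote(path, safe="/")
    return None
```

For example, `classify_media("https://example.com/clip.webm")` yields `"video"`, and `to_endpoint("audio://path/to/file.mp3")` yields `/api/media/get?path=path/to/file.mp3`.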
Backend Changes
python/api/media_get.py (New file)
GET /api/media/get
Frontend Changes
webui/js/messages.js
URL conversion (audio:// -> /api/media/get?path=)
webui/css/messages.css
Theme variables (--color-panel, --color-accent)
max-width constraints
Visual Preview
Audio Player (Compact)
Video Player (Standard)
Testing
Manual Testing Checklist
Browser Compatibility
Usage Examples
For Developers
Sending audio from an agent:
Sending video:
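A minimal sketch covering both cases, assuming the agent only needs to embed a custom-scheme URL in its response text for the WebUI to pick it up. The helper name `media_reference` is hypothetical, and the paths are the placeholder paths from the Protocol Support section.

```python
def media_reference(kind, path):
    """Build an audio:// or video:// reference (hypothetical helper)."""
    if kind not in ("audio", "video"):
        raise ValueError(f"unsupported media kind: {kind!r}")
    return f"{kind}://{path}"


# Audio: the reference is placed inline in the agent's reply text.
audio_message = "Here is the recording: " + media_reference("audio", "path/to/file.mp3")

# Video: same pattern with the video:// scheme.
video_message = "Here is the walkthrough: " + media_reference("video", "path/to/file.mp4")
```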
For Users
Simply attach or reference media files in chat. The UI automatically renders players for:
Audio: .mp3, .wav, .ogg, .aac, .flac, .m4a
Video: .mp4, .webm, .ogv, .mov
Backward Compatibility
Yes. Fully backward compatible.
Configuration
No configuration required. The feature is automatically available when:
Security Considerations
Performance Impact
Future Enhancements
Potential additions for future PRs:
Related Issues
Addresses feature gap: No existing issues specifically for media players, but enhances file attachment functionality referenced in general UI discussions.
Checklist
Files Changed
python/api/media_get.py
webui/js/messages.js
webui/css/messages.css
Total: ~220 lines of production code, ~130 lines of comments/documentation