| 1 | +# OpenCut v0.5.0+ Build Plan |
| 2 | +## Swiss Army Knife Expansion |
| 3 | + |
| 4 | +--- |
| 5 | + |
| 6 | +## Current State (v0.4.0) |
| 7 | + |
| 8 | +### Backend (server.py - 2195 lines) |
| 9 | +- 20+ Flask endpoints: silence, fillers, captions, styled-captions, full pipeline |
| 10 | +- Audio Suite endpoints: /audio/denoise, /audio/isolate, /audio/normalize, /audio/measure, /audio/beats, /audio/effects |
| 11 | +- Video endpoints: /video/scenes, /video/speed-presets |
| 12 | +- System: /system/gpu, SSE streaming, job management, install-whisper |
| 13 | + |
| 14 | +### Core Modules |
| 15 | +- `silence.py` - Speech detection, edit summaries |
| 16 | +- `fillers.py` - Filler word detection/removal (um, uh, like, etc.) |
| 17 | +- `captions.py` - Whisper transcription (3 backends: faster-whisper, openai-whisper, whisperx) |
| 18 | +- `styled_captions.py` - 6 caption styles, word-by-word highlight, action word detection, Pillow+FFmpeg rendering |
| 19 | +- `zoom.py` - Auto-zoom events based on audio energy |
| 20 | +- `audio.py` - PCM/WAV extraction utilities |
| 21 | +- `audio_suite.py` - Noise reduction, voice isolation, loudness normalization, beat detection, audio effects (718 lines) |
| 22 | +- `scene_detect.py` - Scene boundary detection, chapter marker generation, speed ramp presets (309 lines) |
| 23 | +- `diarize.py` - Speaker diarization (pyannote) |
| 24 | + |
| 25 | +### Frontend (CEP Panel) |
| 26 | +- 4 tabs: Silence, Fillers, Captions, Full Edit |
| 27 | +- Premium dark theme with design tokens |
| 28 | +- Auto-import to Premiere Pro, SSE job streaming, port scanning |
| 29 | +- 299-line HTML, 1058-line CSS, 843-line JS |
| 30 | + |
| 31 | +### What's Missing |
| 32 | +The backend already has the Audio Suite, Scene Detection, and GPU endpoints fully wired, but the **frontend exposes only 4 tabs**. None of the audio, video, or export features has a GUI yet. |
| 33 | + |
| 34 | +--- |
| 35 | + |
| 36 | +## PHASE 1: Core Editing (v0.4.0) - COMPLETE |
| 37 | + |
| 38 | +Already built: |
| 39 | +- Silence detection & removal with presets |
| 40 | +- Filler word detection & removal (12 filler types + custom) |
| 41 | +- Basic captions (SRT/VTT/JSON export via Whisper) |
| 42 | +- Styled caption overlays (6 styles, word-by-word highlight, action words) |
| 43 | +- Full pipeline (silence + zoom + captions + fillers combined) |
| 44 | +- Auto-import into Premiere Pro |
| 45 | +- Job system with SSE streaming and polling fallback |
| 46 | +- Inno Setup installer with PyInstaller single-exe |
| 47 | + |
| 48 | +--- |
| 49 | + |
| 50 | +## PHASE 2: GUI Overhaul + Advanced Captions (v0.5.0) |
| 51 | + |
| 52 | +**Goal**: Transform the 4-tab panel into a 6-tab professional toolkit and expand captions. |
| 53 | + |
| 54 | +### 2A: GUI Architecture Overhaul |
| 55 | +- New horizontal icon tab bar: Cut | Captions | Audio | Video | Export | Settings |
| 56 | +- "Cut" tab consolidates: Silence removal, Filler removal, Full pipeline |
| 57 | +- Collapsible card sections within each tab |
| 58 | +- Smooth tab transitions |
| 59 | +- Persistent file selection across all tabs |
| 60 | +- Updated version badge and header |
| 61 | + |
| 62 | +### 2B: Advanced Captions |
| 63 | +- **9 new caption styles** (15 total): Glow, Karaoke, Outline, Gradient, Typewriter, Bounce, Comic, Subtitle, News Ticker |
| 64 | +- **ASS subtitle export** with word-by-word karaoke timing (\kf tags) |
| 65 | +- **Custom style editor**: font picker, color pickers (text/highlight/action/stroke), size slider, position control |
| 66 | +- **Auto-emoji insertion**: keyword->emoji mapping (laugh->laughing emoji, fire->fire emoji, etc.) |
| 67 | +- **Transcript editor panel**: view/edit transcribed text, re-export captions |
| 68 | +- **Caption preview**: live preview of style in the panel with sample text |
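
The ASS export item above relies on `\kf` tags, whose durations are given in centiseconds. A minimal sketch of how one karaoke line could be built is shown here. The function names and the `(text, start, end)` word tuples are assumptions for illustration, not the actual `opencut/export/srt.py` API. Each word's `\kf` duration runs to the start of the next word, so inter-word gaps do not cause drift.

```python
def to_centiseconds(seconds):
    """ASS times and \\kf durations use centisecond precision."""
    return int(round(seconds * 100))


def ass_timestamp(seconds):
    """Format seconds as an ASS timestamp: H:MM:SS.cc."""
    cs = to_centiseconds(seconds)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def karaoke_dialogue(words, style="Default"):
    """Build one ASS Dialogue line with a \\kf tag per word.

    `words` is a list of (text, start_sec, end_sec) tuples (hypothetical shape).
    """
    starts = [to_centiseconds(start) for _, start, _ in words]
    line_end = to_centiseconds(words[-1][2])
    # Each word's fill lasts until the next word starts (or the line ends).
    boundaries = starts[1:] + [line_end]
    parts = []
    for (text, _, _), begin, nxt in zip(words, starts, boundaries):
        parts.append(f"{{\\kf{max(1, nxt - begin)}}}{text}")
    body = " ".join(parts)
    return (
        f"Dialogue: 0,{ass_timestamp(words[0][1])},{ass_timestamp(words[-1][2])},"
        f"{style},,0,0,0,,{body}"
    )


print(karaoke_dialogue([("Hello", 1.0, 1.4), ("world", 1.5, 2.5)]))
```

The resulting line still needs the usual `[Script Info]`, `[V4+ Styles]`, and `[Events]` sections around it to form a complete `.ass` file.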
| 69 | + |
| 70 | +### Files Modified |
| 71 | +- `extension/com.opencut.panel/client/index.html` - Complete rewrite (new tab structure) |
| 72 | +- `extension/com.opencut.panel/client/style.css` - Expanded for new tabs/components |
| 73 | +- `extension/com.opencut.panel/client/main.js` - Complete rewrite (tab routing, new API calls) |
| 74 | +- `opencut/core/styled_captions.py` - Add 9 new styles |
| 75 | +- `opencut/export/srt.py` - Add ASS export function |
| 76 | +- `opencut/server.py` - Add /captions/custom-style and /captions/transcript-edit endpoints |
| 77 | + |
| 78 | +--- |
| 79 | + |
| 80 | +## PHASE 3: Audio Suite Panel (v0.6.0) |
| 81 | + |
| 82 | +**Goal**: Wire existing audio backend to the new Audio tab GUI. |
| 83 | + |
| 84 | +### Features |
| 85 | +- **Noise Reduction**: Method selector (afftdn/highpass+lowpass), strength slider, one-click denoise |
| 86 | +- **Voice Isolation**: Single-button voice emphasis with bandpass filtering |
| 87 | +- **Loudness Normalization**: Platform presets (YouTube -14 LUFS, Podcast -16, Broadcast -23, Spotify -14, Apple -16), custom LUFS input, loudness meter display |
| 88 | +- **Beat Detection**: BPM display, beat visualization, export beat markers to Premiere timeline |
| 89 | +- **Audio Ducking**: VAD-based automatic music ducking during speech |
| 90 | +- **Audio Effects**: Reverb, echo, pitch shift, bass boost, treble boost, telephone, radio, slow-mo voice |
| 91 | +- **Loudness Meter**: Real-time LUFS readout before/after processing |
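
As an illustration of the platform presets listed above, the sketch below maps each preset to its LUFS target and builds an FFmpeg command using the `loudnorm` filter. The helper name, the true-peak and loudness-range defaults, and the file names are assumptions; the existing `audio_suite.py` may do this differently.

```python
# LUFS targets taken from the preset list in this plan.
LUFS_PRESETS = {
    "youtube": -14.0,
    "podcast": -16.0,
    "broadcast": -23.0,
    "spotify": -14.0,
    "apple": -16.0,
}


def loudnorm_args(src, dst, preset=None, target_lufs=None, true_peak=-1.5, lra=11.0):
    """Return an FFmpeg argument list for single-pass loudness normalization.

    `true_peak` and `lra` defaults are illustrative assumptions.
    A custom `target_lufs` overrides the preset.
    """
    if target_lufs is None:
        if preset not in LUFS_PRESETS:
            raise ValueError(f"unknown preset: {preset!r}")
        target_lufs = LUFS_PRESETS[preset]
    audio_filter = f"loudnorm=I={target_lufs}:TP={true_peak}:LRA={lra}"
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]


print(" ".join(loudnorm_args("in.wav", "out.wav", preset="youtube")))
```

Single-pass `loudnorm` adjusts gain dynamically; a two-pass run that feeds the measured values back into the filter is generally more accurate and would fit naturally with the existing /audio/measure endpoint.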
| 92 | + |
| 93 | +### Files Modified |
| 94 | +- `extension/.../index.html` - Add Audio tab panels |
| 95 | +- `extension/.../style.css` - Audio-specific UI components (meters, waveform) |
| 96 | +- `extension/.../main.js` - Audio API integration, beat visualization |
| 97 | +- `opencut/server.py` - Add /audio/duck endpoint |
| 98 | + |
| 99 | +--- |
| 100 | + |
| 101 | +## PHASE 4: Video Intelligence Panel (v0.7.0) |
| 102 | + |
| 103 | +**Goal**: Wire existing video backend + add new features to Video tab GUI. |
| 104 | + |
| 105 | +### Features |
| 106 | +- **Scene Detection**: Sensitivity slider, scene list with timestamps, one-click scene markers |
| 107 | +- **Chapter Markers**: Auto-generate YouTube chapters from scenes, copy-to-clipboard |
| 108 | +- **Speed Ramp Presets**: Ramp In, Ramp Out, Pulse, Heartbeat, Smooth Slow-Mo - applied via FCP XML |
| 109 | +- **Custom Speed Curves**: Visual Bezier curve editor for custom velocity profiles |
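| 110 | +- **Auto-Reframe**: Aspect ratio presets (9:16, 1:1, 4:5), face-tracking crop applied via the FFmpeg crop filter (cropdetect only finds black borders and drawbox only draws overlays, so face positions need a separate detector) |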
| 111 | +- **LUT Application**: Load .cube/.3dl LUT files, apply color grading via FFmpeg lut3d |
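
The chapter-marker feature above could turn scene start times into a YouTube description block roughly as follows. This is a sketch under stated assumptions: the function names and the placeholder "Chapter N" labels are hypothetical, and the input is assumed to be a list of scene start times in seconds. YouTube expects the first chapter at 0:00 and chapters of at least 10 seconds, so shorter scenes are merged into the preceding chapter.

```python
def format_timestamp(seconds):
    """Format seconds as M:SS, or H:MM:SS once the video passes one hour."""
    total = int(seconds)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}" if h else f"{m}:{s:02d}"


def youtube_chapters(scene_starts, labels=None, min_length=10.0):
    """Build a chapter list from scene start times (in seconds)."""
    kept = [0.0]  # the first chapter always starts at 0:00
    for t in sorted(scene_starts):
        # Skip scene cuts that would create a chapter shorter than min_length.
        if t - kept[-1] >= min_length:
            kept.append(t)
    lines = []
    for i, t in enumerate(kept):
        label = labels[i] if labels and i < len(labels) else f"Chapter {i + 1}"
        lines.append(f"{format_timestamp(t)} {label}")
    return "\n".join(lines)


print(youtube_chapters([0.0, 4.2, 65.0, 3725.9]))
```

The returned string can be passed straight to the panel's copy-to-clipboard action.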
| 112 | + |
| 113 | +### New Backend |
| 114 | +- `opencut/core/speed_ramp.py` - Speed curve generation, XML speed keyframes |
| 115 | +- `opencut/core/reframe.py` - Aspect ratio conversion with face-tracking crop |
| 116 | +- `opencut/server.py` - Add /video/speed-ramp, /video/reframe, /video/lut endpoints |
| 117 | + |
| 118 | +### Files Modified |
| 119 | +- `extension/.../index.html` - Add Video tab panels |
| 120 | +- `extension/.../style.css` - Speed curve editor, scene list, LUT preview |
| 121 | +- `extension/.../main.js` - Video API integration, curve editor widget |
| 122 | + |
| 123 | +--- |
| 124 | + |
| 125 | +## PHASE 5: Export & Publish Panel (v0.8.0) |
| 126 | + |
| 127 | +**Goal**: Comprehensive export tools for multi-platform publishing. |
| 128 | + |
| 129 | +### Features |
| 130 | +- **Transcript Export**: 6 formats - Plain text, timestamped text, blog post, show notes, YouTube description, social media clips |
| 131 | +- **YouTube Chapters**: Scene-based chapter generation with topic labels, copy-to-clipboard |
| 132 | +- **Platform Presets**: One-click export profiles for YouTube, TikTok, Instagram Reels, Twitter/X, LinkedIn, Podcast (audio-only) |
| 133 | +- **B-Roll Insertion Points**: NLP keyword detection marking where B-roll footage should go |
| 134 | +- **Batch Processing**: Queue multiple files for sequential processing |
| 135 | +- **Project Templates**: Save/load processing configurations as reusable presets |
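
Of the transcript formats listed above, timestamped text is the simplest to derive from transcription output. The sketch below assumes segments shaped like Whisper's output, that is, dicts with `start`, `end`, and `text` keys. The function name is hypothetical and not the planned `transcript_export.py` interface.

```python
def timestamped_text(segments):
    """Render segments as '[HH:MM:SS] text' lines, one per segment.

    `segments` is assumed to be a list of dicts with 'start' (seconds)
    and 'text' keys, similar to Whisper transcription segments.
    """
    lines = []
    for seg in segments:
        minutes, s = divmod(int(seg["start"]), 60)
        h, m = divmod(minutes, 60)
        lines.append(f"[{h:02d}:{m:02d}:{s:02d}] {seg['text'].strip()}")
    return "\n".join(lines)


segments = [
    {"start": 0.0, "end": 2.1, "text": " Hello there."},
    {"start": 75.4, "end": 78.0, "text": " Next point."},
]
print(timestamped_text(segments))
```

The plain-text format is the same loop without the timestamp prefix.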
| 136 | + |
| 137 | +### New Backend |
| 138 | +- `opencut/core/transcript_export.py` - Multi-format transcript generation |
| 139 | +- `opencut/core/broll_detect.py` - NLP keyword extraction for B-roll markers |
| 140 | +- `opencut/server.py` - Add /export/* endpoints |
| 141 | + |
| 142 | +### Files Modified |
| 143 | +- `extension/.../index.html` - Add Export tab panels |
| 144 | +- `extension/.../style.css` - Export cards, platform icons, batch queue |
| 145 | +- `extension/.../main.js` - Export API integration, clipboard, batch queue |
| 146 | + |
| 147 | +--- |
| 148 | + |
| 149 | +## PHASE 6: Settings, Polish & Ship (v1.0.0) |
| 150 | + |
| 151 | +**Goal**: Settings tab, final polish, installer update, documentation. |
| 152 | + |
| 153 | +### Features |
| 154 | +- **Settings Tab**: Default model selection, output directory, auto-import toggle, theme customization |
| 155 | +- **Model Manager**: View installed models, download/delete on-demand, storage usage display |
| 156 | +- **Keyboard Shortcuts**: Expanded shortcut system for power users |
| 157 | +- **Onboarding**: First-run welcome screen with feature tour |
| 158 | +- **Error Recovery**: Improved error messages, retry buttons, diagnostic info |
| 159 | +- **Performance**: Lazy-load tab contents, debounced API calls, memory cleanup |
| 160 | + |
| 161 | +### Build Updates |
| 162 | +- Updated Inno Setup installer for v1.0.0 |
| 163 | +- Updated PyInstaller spec |
| 164 | +- Updated README with full feature list |
| 165 | +- CHANGELOG.md |
| 166 | +- GitHub release automation |
| 167 | + |
| 168 | +### Files Modified |
| 169 | +- `extension/.../index.html` - Add Settings tab |
| 170 | +- `extension/.../style.css` - Settings components, model manager |
| 171 | +- `extension/.../main.js` - Settings persistence, model management |
| 172 | +- `build/installer.iss` - Version bump |
| 173 | +- `build/opencut.spec` - Include new modules |
| 174 | +- `README.md` - Complete rewrite |
| 175 | +- `pyproject.toml` - Version bump, new dependencies |
| 176 | + |
| 177 | +--- |
| 178 | + |
| 179 | +## Architecture Overview |
| 180 | + |
| 181 | +``` |
| 182 | +extension/com.opencut.panel/client/ |
| 183 | + index.html # 6-tab layout: Cut | Captions | Audio | Video | Export | Settings |
| 184 | + style.css # Premium dark theme with all component styles |
| 185 | + main.js # Tab routing, API integration, UI controllers |
| 186 | + |
| 187 | +opencut/ |
| 188 | + server.py # Flask server - all endpoints |
| 189 | + core/ |
| 190 | + silence.py # Phase 1 (done) |
| 191 | + fillers.py # Phase 1 (done) |
| 192 | + captions.py # Phase 1 (done) |
| 193 | + styled_captions.py # Phase 1 (done) + Phase 2 (new styles) |
| 194 | + zoom.py # Phase 1 (done) |
| 195 | + audio.py # Phase 1 (done) |
| 196 | + audio_suite.py # Phase 3 (backend done, needs ducking) |
| 197 | + scene_detect.py # Phase 4 (backend done) |
| 198 | + diarize.py # Phase 1 (done) |
| 199 | + speed_ramp.py # Phase 4 (new) |
| 200 | + reframe.py # Phase 4 (new) |
| 201 | + transcript_export.py # Phase 5 (new) |
| 202 | + broll_detect.py # Phase 5 (new) |
| 203 | + export/ |
| 204 | + premiere.py # Phase 1 (done) |
| 205 | + srt.py # Phase 1 (done) + Phase 2 (ASS export) |
| 206 | + utils/ |
| 207 | + config.py # Phase 1 (done) |
| 208 | + media.py # Phase 1 (done) |
| 209 | +``` |
| 210 | + |
| 211 | +## Tab → Endpoint Mapping |
| 212 | + |
| 213 | +| Tab | Features | Backend Endpoints | |
| 214 | +|-----|----------|-------------------| |
| 215 | +| Cut | Silence, Fillers, Full Pipeline | /silence, /fillers, /full | |
| 216 | +| Captions | Styled overlay, SRT/VTT/ASS, Custom styles, Transcript editor | /styled-captions, /captions, /caption-styles, /captions/custom-style, /captions/transcript-edit | |
| 217 | +| Audio | Denoise, Isolate, Normalize, Beats, Effects, Ducking | /audio/* | |
| 218 | +| Video | Scenes, Chapters, Speed Ramp, Reframe, LUT | /video/* | |
| 219 | +| Export | Transcript, YouTube chapters, Platform presets, B-roll, Batch | /export/* | |
| 220 | +| Settings | Models, Preferences, Shortcuts, About | /system/*, /health | |
| 221 | + |
| 222 | +## Build Order |
| 223 | + |
| 224 | +Each phase builds incrementally on the previous: |
| 225 | +1. Phase 2 creates the new GUI shell that all subsequent phases plug into |
| 226 | +2. Phases 3-5 each add one tab's worth of functionality |
| 227 | +3. Phase 6 adds settings and polish for v1.0.0 release |
| 228 | + |
| 229 | +Total estimated new/modified code: ~8,000-12,000 lines across all phases. |