Commit a95a327

upd
1 parent 4009267 commit a95a327

24 files changed

Lines changed: 15704 additions & 3204 deletions

BUILDPLAN.md

Lines changed: 229 additions & 0 deletions
@@ -0,0 +1,229 @@
# OpenCut v0.5.0+ Build Plan
## Swiss Army Knife Expansion

---

## Current State (v0.4.0)

### Backend (server.py - 2195 lines)
- 20+ Flask endpoints: silence, fillers, captions, styled-captions, full pipeline
- Audio Suite endpoints: /audio/denoise, /audio/isolate, /audio/normalize, /audio/measure, /audio/beats, /audio/effects
- Video endpoints: /video/scenes, /video/speed-presets
- System: /system/gpu, SSE streaming, job management, install-whisper

### Core Modules
- `silence.py` - Speech detection, edit summaries
- `fillers.py` - Filler word detection/removal (um, uh, like, etc.)
- `captions.py` - Whisper transcription (3 backends: faster-whisper, openai-whisper, whisperx)
- `styled_captions.py` - 6 caption styles, word-by-word highlight, action word detection, Pillow+FFmpeg rendering
- `zoom.py` - Auto-zoom events based on audio energy
- `audio.py` - PCM/WAV extraction utilities
- `audio_suite.py` - Noise reduction, voice isolation, loudness normalization, beat detection, audio effects (718 lines)
- `scene_detect.py` - Scene boundary detection, chapter marker generation, speed ramp presets (309 lines)
- `diarize.py` - Speaker diarization (pyannote)

### Frontend (CEP Panel)
- 4 tabs: Silence, Fillers, Captions, Full Edit
- Premium dark theme with design tokens
- Auto-import to Premiere Pro, SSE job streaming, port scanning
- 299-line HTML, 1058-line CSS, 843-line JS

### What's Missing
The backend has the Audio Suite, Scene Detection, and GPU endpoints fully wired, but the **frontend exposes only 4 tabs**. No audio, video, or export feature has a GUI yet.

---

## PHASE 1: Core Editing (v0.4.0) - COMPLETE

Already built:
- Silence detection & removal with presets
- Filler word detection & removal (12 filler types + custom)
- Basic captions (SRT/VTT/JSON export via Whisper)
- Styled caption overlays (6 styles, word-by-word highlight, action words)
- Full pipeline (silence + zoom + captions + fillers combined)
- Auto-import into Premiere Pro
- Job system with SSE streaming and polling fallback
- Inno Setup installer with PyInstaller single-exe
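
The job system listed above streams progress over SSE. As a rough illustration of the wire format only, the sketch below serializes one job update as a Server-Sent Events frame. The helper name `format_sse`, the `progress` event name, and the payload keys are hypothetical and are not taken from `server.py`.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Serialize one job update as a Server-Sent Events frame."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # json.dumps emits no raw newlines, so a single data: line is enough.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


# Hypothetical payload for illustration.
frame = format_sse({"job_id": "abc123", "progress": 40}, event="progress")
```

A client that cannot hold the stream open could fetch the same JSON payload from a polling endpoint, which is presumably what the polling fallback does.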

---

## PHASE 2: GUI Overhaul + Advanced Captions (v0.5.0)

**Goal**: Transform the 4-tab panel into a 6-tab professional toolkit and expand captions.

### 2A: GUI Architecture Overhaul
- New horizontal icon tab bar: Cut | Captions | Audio | Video | Export | Settings
- "Cut" tab consolidates: Silence removal, Filler removal, Full pipeline
- Collapsible card sections within each tab
- Smooth tab transitions
- Persistent file selection across all tabs
- Updated version badge and header

### 2B: Advanced Captions
- **9 new caption styles** (15 total): Glow, Karaoke, Outline, Gradient, Typewriter, Bounce, Comic, Subtitle, News Ticker
- **ASS subtitle export** with word-by-word karaoke timing (\kf tags)
- **Custom style editor**: font picker, color pickers (text/highlight/action/stroke), size slider, position control
- **Auto-emoji insertion**: keyword->emoji mapping (laugh->laughing emoji, fire->fire emoji, etc.)
- **Transcript editor panel**: view/edit transcribed text, re-export captions
- **Caption preview**: live preview of style in the panel with sample text
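
To make the ASS export item concrete, here is a minimal sketch of how word timings could become one `Dialogue` line with `\kf` tags, whose durations are in centiseconds. The function names and the `(text, start, end)` tuple layout are assumptions for illustration, not the planned `srt.py` API. Each word's fill is stretched to the start of the next word so that no gap handling is needed.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as the H:MM:SS.cc timestamp used in ASS files."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def karaoke_dialogue(words, style="Default"):
    """Build one ASS Dialogue line from (text, start, end) word timings."""
    starts = [round(w[1] * 100) for w in words]
    line_end = round(words[-1][2] * 100)
    # Each word fills until the next word starts; the last fills to line end.
    bounds = starts[1:] + [line_end]
    text = " ".join(
        f"{{\\kf{nxt - cur}}}{w[0]}"
        for w, cur, nxt in zip(words, starts, bounds)
    )
    start, end = ass_time(words[0][1]), ass_time(words[-1][2])
    # Fields: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return f"Dialogue: 0,{start},{end},{style},,0,0,0,,{text}"


line = karaoke_dialogue([("Hello", 1.0, 1.4), ("world", 1.5, 2.0)])
```

A complete file would also need the `[Script Info]`, `[V4+ Styles]`, and `[Events]` headers, which are omitted here.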

### Files Modified
- `extension/com.opencut.panel/client/index.html` - Complete rewrite (new tab structure)
- `extension/com.opencut.panel/client/style.css` - Expanded for new tabs/components
- `extension/com.opencut.panel/client/main.js` - Complete rewrite (tab routing, new API calls)
- `opencut/core/styled_captions.py` - Add 9 new styles
- `opencut/export/srt.py` - Add ASS export function
- `opencut/server.py` - Add /captions/custom-style and /captions/transcript-edit endpoints

---

## PHASE 3: Audio Suite Panel (v0.6.0)

**Goal**: Wire existing audio backend to the new Audio tab GUI.

### Features
- **Noise Reduction**: Method selector (afftdn/highpass+lowpass), strength slider, one-click denoise
- **Voice Isolation**: Single-button voice emphasis with bandpass filtering
- **Loudness Normalization**: Platform presets (YouTube -14 LUFS, Podcast -16 LUFS, Broadcast -23 LUFS, Spotify -14 LUFS, Apple -16 LUFS), custom LUFS input, loudness meter display
- **Beat Detection**: BPM display, beat visualization, export beat markers to Premiere timeline
- **Audio Ducking**: VAD-based automatic music ducking during speech
- **Audio Effects**: Reverb, echo, pitch shift, bass boost, treble boost, telephone, radio, slow-mo voice
- **Loudness Meter**: Real-time LUFS readout before/after processing
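
The platform presets above map naturally onto FFmpeg's `loudnorm` filter, whose `I`, `TP`, and `LRA` options set integrated loudness, true peak, and loudness range. The following is a sketch under assumptions: the preset table mirrors the list above, while the function name, the true-peak and LRA defaults, and the single-pass approach are illustrative choices rather than what `audio_suite.py` does.

```python
# Integrated loudness targets in LUFS, copied from the preset list above.
PLATFORM_LUFS = {
    "youtube": -14.0,
    "podcast": -16.0,
    "broadcast": -23.0,
    "spotify": -14.0,
    "apple": -16.0,
}


def loudnorm_command(src, dst, platform=None, target_lufs=None,
                     true_peak=-1.5, lra=11.0):
    """Build an FFmpeg argument list for single-pass loudness normalization."""
    if target_lufs is None:
        target_lufs = PLATFORM_LUFS[platform]  # KeyError on unknown platform
    audio_filter = f"loudnorm=I={target_lufs}:TP={true_peak}:LRA={lra}"
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]


cmd = loudnorm_command("in.wav", "out.wav", platform="youtube")
```

The list can be passed to `subprocess.run(cmd, check=True)`. A two-pass run, which measures first and then feeds the measured values back into the filter, is generally more accurate but is left out for brevity.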

### Files Modified
- `extension/.../index.html` - Add Audio tab panels
- `extension/.../style.css` - Audio-specific UI components (meters, waveform)
- `extension/.../main.js` - Audio API integration, beat visualization
- `opencut/server.py` - Add /audio/duck endpoint
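
One possible shape for the logic behind the planned /audio/duck endpoint is sketched below. It assumes a VAD step has already produced speech segments as `(start, end)` pairs in seconds, pads and merges them, and builds an FFmpeg `volume` filter that is enabled only during speech. The function names, padding, and gain values are hypothetical. This version switches gain abruptly; a real implementation would likely add fades or use sidechain compression.

```python
def merge_segments(segments, pad=0.2):
    """Pad speech segments and merge any that overlap after padding."""
    merged = []
    for start, end in sorted(segments):
        start, end = max(0.0, start - pad), end + pad
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def duck_filter(segments, duck_gain=0.25, pad=0.2):
    """Build a volume filter string that lowers music while speech is active."""
    windows = merge_segments(segments, pad)
    if not windows:
        return "anull"  # pass-through filter when there is no speech
    expr = "+".join(f"between(t,{s:.2f},{e:.2f})" for s, e in windows)
    # Single quotes protect the commas inside the filtergraph expression.
    return f"volume={duck_gain}:enable='{expr}'"


music_filter = duck_filter([(1.0, 2.0), (2.1, 3.0), (5.0, 6.0)])
```

The resulting string would be applied to the music track only, for example through `-af` on a music-only input, before mixing it with the voice track.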

---

## PHASE 4: Video Intelligence Panel (v0.7.0)

**Goal**: Wire existing video backend + add new features to Video tab GUI.

### Features
- **Scene Detection**: Sensitivity slider, scene list with timestamps, one-click scene markers
- **Chapter Markers**: Auto-generate YouTube chapters from scenes, copy-to-clipboard
- **Speed Ramp Presets**: Ramp In, Ramp Out, Pulse, Heartbeat, Smooth Slow-Mo - applied via FCP XML
- **Custom Speed Curves**: Visual Bezier curve editor for custom velocity profiles
- **Auto-Reframe**: Aspect ratio presets (9:16, 1:1, 4:5), face tracking crop via FFmpeg cropdetect + drawbox
- **LUT Application**: Load .cube/.3dl LUT files, apply color grading via FFmpeg lut3d
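
The chapter-marker feature can be illustrated with a small formatter that turns scene start times into a YouTube-style chapter list. This is a sketch, not the `scene_detect.py` implementation: the function names and the placeholder labels are assumptions. It forces the first chapter to 0:00 and drops scenes that start less than 10 seconds after the previous chapter, which follows YouTube's published chapter requirements as I understand them.

```python
def format_timestamp(seconds):
    """Format seconds as M:SS, or H:MM:SS once past the first hour."""
    total = int(seconds)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}" if h else f"{m}:{s:02d}"


def youtube_chapters(scene_starts, labels=None, min_gap=10.0):
    """Turn scene start times (seconds) into a chapter list for a description."""
    starts = [0.0]  # the first chapter must begin at 0:00
    for t in sorted(scene_starts):
        if t - starts[-1] >= min_gap:
            starts.append(t)
    lines = []
    for i, t in enumerate(starts):
        label = labels[i] if labels and i < len(labels) else f"Chapter {i + 1}"
        lines.append(f"{format_timestamp(t)} {label}")
    return "\n".join(lines)


chapters = youtube_chapters([0.0, 4.0, 65.5, 3725.0])
```

The returned string is what a copy-to-clipboard button would place on the clipboard.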

### New Backend
- `opencut/core/speed_ramp.py` - Speed curve generation, XML speed keyframes
- `opencut/core/reframe.py` - Aspect ratio conversion with face-tracking crop
- `opencut/server.py` - Add /video/speed-ramp, /video/reframe, /video/lut endpoints

### Files Modified
- `extension/.../index.html` - Add Video tab panels
- `extension/.../style.css` - Speed curve editor, scene list, LUT preview
- `extension/.../main.js` - Video API integration, curve editor widget

---

## PHASE 5: Export & Publish Panel (v0.8.0)

**Goal**: Comprehensive export tools for multi-platform publishing.

### Features
- **Transcript Export**: 6 formats - Plain text, timestamped text, blog post, show notes, YouTube description, social media clips
- **YouTube Chapters**: Scene-based chapter generation with topic labels, copy-to-clipboard
- **Platform Presets**: One-click export profiles for YouTube, TikTok, Instagram Reels, Twitter/X, LinkedIn, Podcast (audio-only)
- **B-Roll Insertion Points**: NLP keyword detection marking where B-roll footage should go
- **Batch Processing**: Queue multiple files for sequential processing
- **Project Templates**: Save/load processing configurations as reusable presets
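
As an example of the simplest transcript formats listed above, the sketch below renders the timestamped-text variant. It assumes segments shaped like Whisper output, that is, dicts with `start`, `end`, and `text` keys. The function name is hypothetical and the planned `transcript_export.py` may structure this differently.

```python
def timestamped_transcript(segments):
    """Render transcript segments as '[HH:MM:SS] text' lines."""
    lines = []
    for seg in segments:
        total = int(seg["start"])
        h, rem = divmod(total, 3600)
        m, s = divmod(rem, 60)
        # Whisper text often carries a leading space, so strip it.
        lines.append(f"[{h:02d}:{m:02d}:{s:02d}] {seg['text'].strip()}")
    return "\n".join(lines)


# Hypothetical segments for illustration.
sample = [
    {"start": 0.0, "end": 2.5, "text": " Welcome back."},
    {"start": 75.2, "end": 78.0, "text": " Let's get started."},
]
transcript = timestamped_transcript(sample)
```

The plain-text format would be the same loop without the timestamp prefix.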

### New Backend
- `opencut/core/transcript_export.py` - Multi-format transcript generation
- `opencut/core/broll_detect.py` - NLP keyword extraction for B-roll markers
- `opencut/server.py` - Add /export/* endpoints

### Files Modified
- `extension/.../index.html` - Add Export tab panels
- `extension/.../style.css` - Export cards, platform icons, batch queue
- `extension/.../main.js` - Export API integration, clipboard, batch queue

---

## PHASE 6: Settings, Polish & Ship (v1.0.0)

**Goal**: Settings tab, final polish, installer update, documentation.

### Features
- **Settings Tab**: Default model selection, output directory, auto-import toggle, theme customization
- **Model Manager**: View installed models, download/delete on-demand, storage usage display
- **Keyboard Shortcuts**: Expanded shortcut system for power users
- **Onboarding**: First-run welcome screen with feature tour
- **Error Recovery**: Improved error messages, retry buttons, diagnostic info
- **Performance**: Lazy-load tab contents, debounced API calls, memory cleanup

### Build Updates
- Updated Inno Setup installer for v1.0.0
- Updated PyInstaller spec
- Updated README with full feature list
- CHANGELOG.md
- GitHub release automation

### Files Modified
- `extension/.../index.html` - Add Settings tab
- `extension/.../style.css` - Settings components, model manager
- `extension/.../main.js` - Settings persistence, model management
- `build/installer.iss` - Version bump
- `build/opencut.spec` - Include new modules
- `README.md` - Complete rewrite
- `pyproject.toml` - Version bump, new dependencies

---

## Architecture Overview

```
extension/com.opencut.panel/client/
  index.html               # 6-tab layout: Cut | Captions | Audio | Video | Export | Settings
  style.css                # Premium dark theme with all component styles
  main.js                  # Tab routing, API integration, UI controllers

opencut/
  server.py                # Flask server - all endpoints
  core/
    silence.py             # Phase 1 (done)
    fillers.py             # Phase 1 (done)
    captions.py            # Phase 1 (done)
    styled_captions.py     # Phase 1 (done) + Phase 2 (new styles)
    zoom.py                # Phase 1 (done)
    audio.py               # Phase 1 (done)
    audio_suite.py         # Phase 3 (backend done, needs ducking)
    scene_detect.py        # Phase 4 (backend done)
    diarize.py             # Phase 1 (done)
    speed_ramp.py          # Phase 4 (new)
    reframe.py             # Phase 4 (new)
    transcript_export.py   # Phase 5 (new)
    broll_detect.py        # Phase 5 (new)
  export/
    premiere.py            # Phase 1 (done)
    srt.py                 # Phase 1 (done) + Phase 2 (ASS export)
  utils/
    config.py              # Phase 1 (done)
    media.py               # Phase 1 (done)
```

## Tab → Endpoint Mapping

| Tab | Features | Backend Endpoints |
|-----|----------|-------------------|
| Cut | Silence, Fillers, Full Pipeline | /silence, /fillers, /full |
| Captions | Styled overlay, SRT/VTT/ASS, Custom styles, Transcript editor | /styled-captions, /captions, /caption-styles |
| Audio | Denoise, Isolate, Normalize, Beats, Effects, Ducking | /audio/* |
| Video | Scenes, Chapters, Speed Ramp, Reframe, LUT | /video/* |
| Export | Transcript, YouTube chapters, Platform presets, B-roll, Batch | /export/* |
| Settings | Models, Preferences, Shortcuts, About | /system/*, /health |

## Build Order

Each phase builds incrementally on the previous one:
1. Phase 2 creates the new GUI shell that all subsequent phases plug into
2. Phases 3-5 each add one tab's worth of functionality
3. Phase 6 adds settings and polish for the v1.0.0 release

Total estimated new/modified code: ~8,000-12,000 lines across all phases.