
Commit 5d5ca03

feat: AI diagram generation in Draw tags + 37 new Playwright tests
- AI-powered diagram generation: natural language → Excalidraw JSON via LLM
- New AI prompt section in {{Draw:}} cards with text input, model selector, 🚀 Generate button
- EXCALIDRAW_CHEAT_SHEET system prompt for element schema (rect, ellipse, diamond, text, arrow)
- repairJson() auto-fixes common LLM JSON mistakes (trailing commas, truncated output)
- @model: field in Draw tags for per-card model persistence
- Gemini API key forwarding to Excalidraw embed
- 22 new draw-docgen tests, 7 readonly-mode tests, 8 excalidraw-library tests
- 5 new regression pins (GLM-OCR, Draw card, readonly CSS, embed page)
1 parent 0104878 commit 5d5ca03
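
The `repairJson()` cleanup named in the commit message could look roughly like the sketch below. This is not the project's code: the function name `repairJsonSketch` and the exact repair rules are assumptions made for illustration. It removes trailing commas, closes a string that was cut off mid-value, and appends any missing closing brackets before handing the text to `JSON.parse`.

```javascript
// Hypothetical sketch; the real repairJson() in js/draw-docgen.js may differ.
function repairJsonSketch(raw) {
  const out = [];
  const stack = []; // closing brackets still owed, innermost last
  let inString = false;
  let escaped = false;

  // Remove a comma (and any whitespace after it) at the end of `out`.
  const dropTrailingComma = () => {
    let i = out.length - 1;
    while (i >= 0 && /\s/.test(out[i])) i--;
    if (i >= 0 && out[i] === ',') out.length = i;
  };

  for (const ch of raw.trim()) {
    if (inString) {
      out.push(ch);
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') stack.push('}');
    else if (ch === '[') stack.push(']');
    else if (ch === '}' || ch === ']') {
      dropTrailingComma(); // fixes "[1, 2, ]" and "{...}, }"
      stack.pop();
    }
    out.push(ch);
  }

  if (inString) out.push('"'); // output was truncated inside a string
  dropTrailingComma();         // output was truncated right after a comma
  while (stack.length) out.push(stack.pop());

  // Throws SyntaxError if the text is still not valid JSON.
  return JSON.parse(out.join(''));
}

const repaired = repairJsonSketch('{"elements": [{"type": "rectangle", "x": 10}, {"type": "tex');
console.log(repaired.elements.length);
```

The sketch only recovers from truncation that happens inside a string value or after a complete value. Output cut off after an object key or a colon would still fail to parse, so a caller would likely want a retry path for that case.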

12 files changed

Lines changed: 1214 additions & 3 deletions

README.md

Lines changed: 3 additions & 2 deletions
@@ -36,7 +36,7 @@
 | **Desktop** | Native app via Neutralino.js with system tray and offline support |
 | **Code Execution** | 7 languages in-browser: Bash ([just-bash](https://justbash.dev/)), Math (Nerdamer), LaTeX (MathJax + Nerdamer evaluation), Python ([Pyodide](https://pyodide.org/)), HTML (sandboxed iframe, `html-autorun` for widgets/quizzes), JavaScript (sandboxed iframe), SQL ([sql.js](https://sql.js.org/) SQLite) · 25+ compiled languages via [Judge0 CE](https://ce.judge0.com): C, C++, Rust, Go, Java, TypeScript, Kotlin, Scala, Ruby, Swift, Haskell, Dart, C#, and more · **▶ Run All** notebook engine — one-click sequential execution with preflight dialog (block table with model/status), pre-execution model loading (AI + TTS auto-loaded before blocks run), progress bar, abort, per-block status badges, detailed console logging, and SQLite shared context store |
 | **Security** | Content Security Policy (CSP), SRI integrity hashes, XSS sanitization (DOMPurify), ReDoS protection, Firestore write-token ownership, API keys via HTTP headers, postMessage origin validation, 8-char passphrase minimum, sandboxed code execution |
-| **AI Document Tags** | `{{@AI:}}` text generation (`@think: Yes` for deep reasoning), `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M, Florence-2 230M, or GLM-OCR 1.5B, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML), `{{@Draw:}}` whiteboard (Excalidraw + Mermaid, Insert/PNG/SVG export, 📚 Library Browser with 29 bundled packs in 6 categories) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations |
+| **AI Document Tags** | `{{@AI:}}` text generation (`@think: Yes` for deep reasoning), `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M, Florence-2 230M, or GLM-OCR 1.5B, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML), `{{@Draw:}}` whiteboard (Excalidraw + Mermaid, Insert/PNG/SVG export, 📚 Library Browser with 29 bundled packs in 6 categories, 🚀 AI diagram generation with natural language prompt and model selector) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations |
 | **🔌 API Calls** | `{{API:}}` REST API integration — GET/POST/PUT/DELETE methods, custom headers, JSON body, response stored in `$(api_varName)` variables; inline review panel; toolbar GET/POST buttons |
 | **🔗 Agent Flow** | `{{Agent:}}` multi-step pipeline — define Step 1/2/3, chain outputs, per-card model + search provider selector, live step status indicators (⏳/✅/❌), review combined output |
 | **🔍 Web Search** | Toggle web search for AI — 7 providers: DuckDuckGo (free), Brave Search, Serper.dev, Tavily (AI-optimized), Google CSE, Wikipedia, Wikidata; search results injected into LLM context; source citations in responses; per-agent-card search provider selector |
@@ -48,7 +48,7 @@
 | **💾 Disk Workspace** | Folder-backed storage via File System Access API — "Open Folder" in sidebar header; `.md` files read/written directly to disk; `.textagent/workspace.json` manifest; debounced autosave ("💾 Saved to disk" indicator); refresh from disk for external edits; disconnect to revert to localStorage; auto-reconnect on reload via IndexedDB handles; unified action modal for rename/duplicate/delete with confirmation; Chromium-only (hidden in unsupported browsers) |
 | **📈 Finance Dashboard** | Stock/crypto/index dashboard templates with live TradingView charts; dynamic grid via `data-var-prefix` (add/remove tickers in `@variables` table, grid auto-adjusts); configurable chart range (`1M`, `12M`, `36M`), interval (`D`, `W`, `M`), and EMA period (default 52); interactive 1M/1Y/3Y range + 52D/52W/52M EMA toggle buttons; `@variables` table persists after ⚡ Vars for re-editing; JS code block generates grid HTML from variables |
 | **Extras** | Auto-save (localStorage + cloud), table of contents, image paste, 123+ templates (14 categories: AI, Agents, Coding, Creative, Documentation, Finance, Games, Maths, PPT, Project, Quiz, Skills, Tables, Technical), AI Model Manager template (local model reference with sizes, privacy, and capabilities), template variable substitution (`$(varName)` with auto-detect), table spreadsheet tools (sort, filter, stats, chart, add row/col, inline cell edit, CSV/MD export), content statistics, modular codebase (13+ JS modules), fully responsive mobile UI with scrollable Quick Action Bar (Files, Search, TOC, Share, Copy, Tools, AI, Model, Upload, Help) and formatting toolbar, multi-file workspace sidebar, compact header mode with collapsible Tools dropdown (Presentation, Zen, Word Wrap, Focus, Voice, Dark Mode, Preview Theme), Clear All / Clear Selection buttons (undoable via Ctrl+Z) |
-| **Dev Tooling** | ESLint + Prettier (lint, format:check), Playwright test suite — 484 tests across smoke, feature, integration, dev, regression, performance, quality, and security categories (import, export, share, view-mode, editor, email-to-self, secure share, startup timing, export integrity, persistence, module loading, disk workspace, context memory, exec engine, build validation, load-time, accessibility, video player, TTS, STT, file converters, stock widget, embed grid, model registry, model tag, game tag, static analysis, code smell, XSS hardening, Florence-2 model, Docling model, TTS download), pre-commit changelog enforcement, GitHub Actions CI |
+| **Dev Tooling** | ESLint + Prettier (lint, format:check), Playwright test suite — 521 tests across smoke, feature, integration, dev, regression, performance, quality, and security categories (import, export, share, view-mode, editor, email-to-self, secure share, startup timing, export integrity, persistence, module loading, disk workspace, context memory, exec engine, build validation, load-time, accessibility, video player, TTS, STT, file converters, stock widget, embed grid, model registry, model tag, game tag, draw docgen, readonly mode, excalidraw library, static analysis, code smell, XSS hardening, Florence-2 model, Docling model, GLM-OCR model, TTS download), pre-commit changelog enforcement, GitHub Actions CI |
 
 ## 🤖 AI Assistant
 
@@ -460,6 +460,7 @@ TextAgent has undergone significant evolution since its inception. What started
 
 | Date | Commits | Feature / Update |
 |------|---------|-----------------:|
+| **2026-03-18** | | 🚀 **AI Diagram Generation** — natural language → Excalidraw JSON via LLM; new AI prompt section in `{{Draw:}}` cards with text input, model selector dropdown, and 🚀 Generate button; `EXCALIDRAW_CHEAT_SHEET` system prompt teaches LLM the element schema (rectangle, ellipse, diamond, text, arrow, line); `repairJson()` auto-fixes common LLM JSON mistakes (trailing commas, truncated output, missing brackets); `@model:` field in Draw tags for per-card model persistence; cancel/retry support; Gemini API key forwarding to Excalidraw embed; 37 new Playwright tests (22 draw-docgen, 7 readonly-mode, 8 excalidraw-library) + 5 regression pins |
 | **2026-03-18** | | 📷 **GLM-OCR Model** — added [GLM-OCR (1.5B)](https://huggingface.co/textagent/GLM-OCR-ONNX) as third local OCR model alongside Granite Docling and Florence-2; `ai-worker-glm-ocr.js` Web Worker using q4f16 quantization (~650 MB, WebGPU required); primary `textagent/GLM-OCR-ONNX` with `onnx-community/GLM-OCR-ONNX` fallback; `glm-ocr` entry in `ai-models.js` with `isDocModel: true`; documentation updated; 7 new Playwright model registry tests |
 | **2026-03-18** | | 📚 **Excalidraw Library Browser** — 29 bundled library packs (600+ items) organized in 6 categories (Architecture & System Design, UI/UX & Wireframing, Icons & Logos, Cloud & DevOps, Data & Algorithms, AI/Science & Education) with slide-in Library Browser panel; each library card with name, description, and toggle switch for on-demand loading; real-time search/filter; injected via MutationObserver into Excalidraw's native Library sidebar as "📦 Browse & Add Library Packs" button; libraries include Software Architecture, System Design Components, AWS Icons, Google Icons (139 items), UML/ER, Wireframing, Deep Learning, Math Teacher, Charts, Graphs, and more |
 | **2026-03-18** | | 🎨 **Draw DocGen Integration** — full `{{Draw:}}` tag pipeline: `transformDrawMarkdown` + `bindDrawPreviewActions` in renderer, 🎨 Draw toolbar button, `excalidraw.com` added to CSP `frame-src`, `draw-docgen.css` (309-line standalone stylesheet with card UI, tool pills, Mermaid editor, dark mode), `draw-docgen.js` lazy-loaded as Phase 3j; DOMPurify allowlist expanded with `data-draw-index`, `data-draw-tool`, `data-tool`, `data-skill` |
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# CHANGELOG — AI Diagram Generation + Test Coverage
+
+## AI Diagram Generation (Draw DocGen)
+- Added AI-powered diagram generation to `{{Draw:}}` cards — users can describe a diagram in natural language and generate Excalidraw JSON via LLM
+- New `.draw-ai-prompt-section` with text input, model selector dropdown, and 🚀 Generate button
+- `js/draw-docgen.js`: ~400 lines of new logic including `repairJson()` for LLM output cleanup, `@model:` field parsing, Excalidraw element rendering pipeline, and cancel/retry support
+- `css/draw-docgen.css`: 133 lines of new CSS for AI prompt row, generate button, model selector, status bar, and dark mode
+- `public/ai-worker-common.js`: added `excalidraw_diagram: 16384` token limit and `EXCALIDRAW_CHEAT_SHEET` const (Excalidraw element schema reference for LLM context)
+- `public/ai-worker-gemini.js` and `public/ai-worker.js`: registered `excalidraw_diagram` task type
+- `public/excalidraw-embed.html`: added `set-api-key` postMessage handler for forwarding Gemini key to embed
+- `styles.css`: minor CSS import fix
+
+## Comprehensive Test Coverage
+- **NEW** `tests/feature/draw-docgen.spec.js` — 22 tests covering module loading, tag parsing, card rendering, tool pills, Mermaid editor, AI prompt section, AI Generate button, model selector, toolbar integration, and error safety
+- **NEW** `tests/feature/readonly-mode.spec.js` — 7 tests covering CSS lockdown enforcement, JS guards (insertAtCursor, paste, keyboard), and opacity verification
+- **NEW** `tests/feature/excalidraw-library.spec.js` — 8 tests covering asset serving, JSON validity, embed page structure, and 29+ library file count
+- **UPDATED** `tests/regression/regression-recent.spec.js` — 5 new regression pins: GLM-OCR model entry, Draw tag card rendering, readonly CSS opacity lockdown, Excalidraw embed page accessibility
+
+## Files Modified
+- `js/draw-docgen.js` (+397 lines)
+- `css/draw-docgen.css` (+133 lines)
+- `public/ai-worker-common.js` (+51 lines)
+- `public/ai-worker-gemini.js` (+2 lines)
+- `public/ai-worker.js` (+2 lines)
+- `public/excalidraw-embed.html` (+6 lines)
+- `styles.css` (+1 line)
+- `tests/regression/regression-recent.spec.js` (+61 lines)
+- `tests/feature/draw-docgen.spec.js` (NEW, 271 lines)
+- `tests/feature/readonly-mode.spec.js` (NEW, 160 lines)
+- `tests/feature/excalidraw-library.spec.js` (NEW, 115 lines)
+
+## Test Results
+All 52 tests pass (44s, Chromium via Playwright)
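
The changelog mentions a `set-api-key` postMessage handler in `public/excalidraw-embed.html` but does not show it. A minimal sketch of such a handler follows, under several assumptions: the message shape `{ type: 'set-api-key', key }`, the helper name `createKeyReceiver`, and the same-origin check are all hypothetical and chosen to match the "postMessage origin validation" item in the README's Security row.

```javascript
// Hypothetical sketch; the actual handler and message shape may differ.
function createKeyReceiver(allowedOrigin) {
  let apiKey = null;

  // Returns true only when a well-formed message from the allowed origin was accepted.
  function onMessage(event) {
    if (event.origin !== allowedOrigin) return false; // ignore foreign senders
    const data = event.data;
    if (!data || data.type !== 'set-api-key') return false;
    if (typeof data.key !== 'string' || data.key.length === 0) return false;
    apiKey = data.key;
    return true;
  }

  return { onMessage, getKey: () => apiKey };
}

// In the embed page this would be wired up roughly as:
//   const receiver = createKeyReceiver(window.location.origin);
//   window.addEventListener('message', receiver.onMessage);
// and the parent page would send:
//   iframe.contentWindow.postMessage({ type: 'set-api-key', key }, window.location.origin);
```

Keeping the key in a closure rather than on `window` limits what other scripts in the embed can read, though that is a design preference of this sketch and not something the commit states.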

css/draw-docgen.css

Lines changed: 133 additions & 0 deletions
@@ -137,6 +137,107 @@
     font-weight: 600;
 }
 
+/* ─── AI Generate Button & Prompt ─── */
+
+/* Model selector (same pattern as git-docgen) */
+.draw-ai-model-select {
+    border: 1px solid rgba(105, 101, 219, 0.3);
+    border-radius: 6px;
+    padding: 3px 6px;
+    font-size: 0.75rem;
+    background: transparent;
+    color: inherit;
+    max-width: 140px;
+    cursor: pointer;
+}
+
+.draw-ai-prompt-section {
+    padding: 10px 14px;
+    background: rgba(105, 101, 219, 0.04);
+    border-bottom: 1px solid rgba(105, 101, 219, 0.1);
+}
+
+.draw-ai-prompt-row {
+    display: flex;
+    gap: 6px;
+    align-items: center;
+}
+
+.draw-ai-prompt-input {
+    flex: 1;
+    border: 1px solid rgba(105, 101, 219, 0.25);
+    border-radius: 8px;
+    padding: 7px 12px;
+    font-size: 0.82rem;
+    background: rgba(255, 255, 255, 0.6);
+    color: inherit;
+    transition: border-color 0.2s;
+    font-family: inherit;
+}
+
+.draw-ai-prompt-input:focus {
+    outline: none;
+    border-color: #6965db;
+    box-shadow: 0 0 0 3px rgba(105, 101, 219, 0.12);
+}
+
+.draw-ai-prompt-input::placeholder {
+    color: #9ca3af;
+    font-size: 0.78rem;
+}
+
+.draw-ai-generate-btn {
+    background: linear-gradient(135deg, #6965db 0%, #a855f7 100%) !important;
+    color: #fff !important;
+    font-weight: 600;
+    min-width: 100px;
+}
+
+.draw-ai-generate-btn:hover {
+    background: linear-gradient(135deg, #5753c9 0%, #9333ea 100%) !important;
+}
+
+.draw-ai-generate-btn:disabled {
+    opacity: 0.6;
+    cursor: not-allowed;
+}
+
+.draw-ai-cancel-btn {
+    padding: 4px 8px !important;
+    font-size: 0.85rem !important;
+    background: transparent !important;
+    color: #9ca3af !important;
+}
+
+.draw-ai-cancel-btn:hover {
+    color: #ef4444 !important;
+}
+
+.draw-ai-status {
+    margin-top: 8px;
+    font-size: 0.78rem;
+    color: #6965db;
+    padding: 6px 10px;
+    border-radius: 6px;
+    background: rgba(105, 101, 219, 0.06);
+}
+
+.draw-ai-spinner {
+    display: inline-block;
+    width: 14px;
+    height: 14px;
+    border: 2px solid rgba(105, 101, 219, 0.2);
+    border-top-color: #6965db;
+    border-radius: 50%;
+    animation: draw-ai-spin 0.8s linear infinite;
+    vertical-align: middle;
+    margin-right: 6px;
+}
+
+@keyframes draw-ai-spin {
+    to { transform: rotate(360deg); }
+}
+
 /* ─── Dark Mode ─── */
 
 [data-theme="dark"] .draw-docgen-card,
@@ -184,6 +285,38 @@
     color: #34d399;
 }
 
+/* ─── Dark Mode: AI Prompt ─── */
+
+[data-theme="dark"] .draw-ai-prompt-section,
+.dark-mode .draw-ai-prompt-section {
+    background: rgba(105, 101, 219, 0.06);
+    border-bottom-color: rgba(105, 101, 219, 0.15);
+}
+
+[data-theme="dark"] .draw-ai-prompt-input,
+.dark-mode .draw-ai-prompt-input {
+    background: rgba(0, 0, 0, 0.3);
+    border-color: rgba(165, 162, 241, 0.3);
+    color: #e5e7eb;
+}
+
+[data-theme="dark"] .draw-ai-prompt-input::placeholder,
+.dark-mode .draw-ai-prompt-input::placeholder {
+    color: #6b7280;
+}
+
+[data-theme="dark"] .draw-ai-status,
+.dark-mode .draw-ai-status {
+    color: #a5a2f1;
+    background: rgba(105, 101, 219, 0.1);
+}
+
+[data-theme="dark"] .draw-ai-spinner,
+.dark-mode .draw-ai-spinner {
+    border-color: rgba(165, 162, 241, 0.2);
+    border-top-color: #a5a2f1;
+}
+
 /* ─── Tool Pills (Excalidraw / Mermaid) ─── */
 
 .draw-tool-pills {
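
The stylesheet above defines a disabled state for `.draw-ai-generate-btn`, a `.draw-ai-cancel-btn`, and a `.draw-ai-status` bar with a `.draw-ai-spinner`, which suggests a generate/cancel cycle. The sketch below shows one way such a cycle might drive those classes. Only the class name `draw-ai-spinner` comes from the diff: the function `startGeneration`, the `ui` object shape, the status strings, and the use of `AbortController` are assumptions for illustration.

```javascript
// Hypothetical sketch; not taken from js/draw-docgen.js.
// `ui.generateBtn` and `ui.status` stand in for the button and status elements.
// `generate` is any function that takes an AbortSignal and returns a Promise.
function startGeneration(ui, generate) {
  const controller = new AbortController();

  // Busy state: the :disabled rule dims the button, the spinner shows in the status bar.
  ui.generateBtn.disabled = true;
  ui.status.innerHTML = '<span class="draw-ai-spinner"></span>Generating diagram...';

  const done = generate(controller.signal)
    .then((result) => {
      ui.status.textContent = 'Diagram ready';
      return result;
    })
    .catch((err) => {
      ui.status.textContent =
        err && err.name === 'AbortError' ? 'Cancelled' : 'Generation failed';
      return null;
    })
    .finally(() => {
      ui.generateBtn.disabled = false; // allow retry
    });

  // The cancel button's click handler would call cancel().
  return { done, cancel: () => controller.abort() };
}
```

Re-enabling the button in `finally` covers success, failure, and cancellation with one code path, which is what makes a retry possible after any outcome.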
