|
28 | 28 | | **🎬 Media Embedding** | Video playback via `![alt](url)` image syntax (`.mp4`, `.webm`, `.ogg`, `.mov`, `.m4v`); YouTube/Vimeo embeds auto-detected; `embed` code block for responsive media grids (`cols=1-4`, `height=N`); Video.js v10 lazy-loaded with native `<video>` fallback; website URLs render as rich link preview cards with favicon + "Open ↗" button | |
29 | 29 | | **🤖 AI Assistant** | 3 local Qwen 3.5 sizes (0.8B / 2B / 4B via WebGPU/WASM), Gemini 3.1 Flash Lite, Groq Llama 3.3 70B, OpenRouter — summarize, expand, rephrase, grammar-fix, explain, simplify, auto-complete; AI writing tags (Polish, Formalize, Elaborate, Shorten, Image); enhanced context menu; per-card model selection; concurrent block generation; inline review with accept/reject/regenerate; AI-powered image generation; **smart model loading UX** — cache vs download detection (📦/⬇️), HuggingFace source location display, delete cached models from browser storage; all models hosted on [`textagent` HuggingFace org](https://huggingface.co/textagent) with automatic fallback | |
30 | 30 | | **🎤 Voice Dictation** | Dual-engine speech-to-text: **Voxtral Mini 3B** (WebGPU, primary, 13 languages, ~2.7 GB) or **Whisper Large V3 Turbo** (WASM fallback, ~800 MB) with consensus scoring; download consent popup with model info before first use; 50+ Markdown-aware voice commands — natural phrases ("heading one", "bold…end bold", "add table", "undo"); auto-punctuation via AI refinement or built-in fallback; streaming partial results | |
31 | | -| **🔊 Text-to-Speech** | Hybrid Kokoro TTS engine — English/Chinese via [Kokoro 82M v1.1-zh ONNX](https://huggingface.co/textagent/Kokoro-82M-v1.1-zh-ONNX) (~80 MB, off-thread WebWorker), Japanese & 10+ languages via Web Speech API fallback; hover any preview text and click 🔊 to hear pronunciation; voice auto-selection by language; ⬇ Save button to download generated audio as WAV file | |
| 31 | +| **🔊 Text-to-Speech** | Hybrid Kokoro TTS engine — English/Chinese via [Kokoro 82M v1.1-zh ONNX](https://huggingface.co/textagent/Kokoro-82M-v1.1-zh-ONNX) (~80 MB, off-thread WebWorker), Japanese & 10+ languages via Web Speech API fallback; TTS card with separate ▶ Run (generate audio) / ▷ Play (replay) / 💾 Save (WAV download) buttons; hover any preview text and click 🔊 to hear pronunciation; voice auto-selection by language | |
32 | 32 | | **Import** | MD, DOCX, XLSX/XLS, CSV, HTML, JSON, XML, PDF — drag & drop or click to import | |
33 | 33 | | **Export** | Markdown, self-contained styled HTML, PDF (smart page-breaks, shared rendering pipeline), LLM Memory (5 formats: XML, JSON, Compact JSON, Markdown, Plain Text + shareable link) | |
34 | 34 | | **Sharing** | AES-256-GCM encrypted sharing via Firebase; read-only shared links, optional passphrase protection — decryption key stays in URL fragment (never sent to server) | |
35 | 35 | | **Presentation** | Slide mode using `---` separators, keyboard navigation, multiple layouts & transitions, speaker notes, overview grid, 20+ PPT templates with image backgrounds | |
36 | 36 | | **Desktop** | Native app via Neutralino.js with system tray and offline support | |
37 | | -| **Code Execution** | 7 languages in-browser: Bash ([just-bash](https://justbash.dev/)), Math (Nerdamer), LaTeX (MathJax + Nerdamer evaluation), Python ([Pyodide](https://pyodide.org/)), HTML (sandboxed iframe, `html-autorun` for widgets/quizzes), JavaScript (sandboxed iframe), SQL ([sql.js](https://sql.js.org/) SQLite) · 25+ compiled languages via [Judge0 CE](https://ce.judge0.com): C, C++, Rust, Go, Java, TypeScript, Kotlin, Scala, Ruby, Swift, Haskell, Dart, C#, and more · **▶ Run All** notebook engine — one-click sequential execution of all blocks with progress bar, abort, per-block status badges, and SQLite shared context store | |
| 37 | +| **Code Execution** | 7 languages in-browser: Bash ([just-bash](https://justbash.dev/)), Math (Nerdamer), LaTeX (MathJax + Nerdamer evaluation), Python ([Pyodide](https://pyodide.org/)), HTML (sandboxed iframe, `html-autorun` for widgets/quizzes), JavaScript (sandboxed iframe), SQL ([sql.js](https://sql.js.org/) SQLite) · 25+ compiled languages via [Judge0 CE](https://ce.judge0.com): C, C++, Rust, Go, Java, TypeScript, Kotlin, Scala, Ruby, Swift, Haskell, Dart, C#, and more · **▶ Run All** notebook engine — one-click sequential execution with preflight dialog (block table with model/status), pre-execution model loading (AI + TTS auto-loaded before blocks run), progress bar, abort, per-block status badges, detailed console logging, and SQLite shared context store | |
38 | 38 | | **Security** | Content Security Policy (CSP), SRI integrity hashes, XSS sanitization (DOMPurify), ReDoS protection, Firestore write-token ownership, API keys via HTTP headers, postMessage origin validation, 8-char passphrase minimum, sandboxed code execution | |
39 | 39 | | **AI Document Tags** | `{{@AI:}}` text generation, `{{@Think:}}` deep reasoning, `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M or Florence-2 230M, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations | |
40 | 40 | | **🔌 API Calls** | `{{API:}}` REST API integration — GET/POST/PUT/DELETE methods, custom headers, JSON body, response stored in `$(api_varName)` variables; inline review panel; toolbar GET/POST buttons | |
@@ -459,6 +459,7 @@ TextAgent has undergone significant evolution since its inception. What started |
459 | 459 |
|
460 | 460 | | Date | Commits | Feature / Update | |
461 | 461 | |------|---------|-----------------| |
| 462 | +| **2026-03-15** | | 🚀 **Run All Engine & TTS UX** — pre-execution model readiness check auto-loads all required models (AI + Kokoro TTS) before block execution starts; detailed `[RunAll]` console logging with `console.table` block summary, per-block timing, variable resolution status (✅/⚠), and completion summary; Stop button now works during model loading via `M._execAborted` cross-module flag; `ensureModelReadyAsync()` rewritten with fail-fast on missing consent/API key; compact preflight dialog (960px, smaller fonts, all 8 columns visible); `waitForModelReady()` handles Kokoro TTS via `M.tts.isKokoroReady()`; TTS card split into 3 buttons: ▶ Run (generate audio only), ▷ Play (replay stored audio), 💾 Save (download WAV); new `M.tts.generate()`, `playLastAudio()`, `isKokoroReady()`, `initKokoro()` APIs; AI model fallback in `run-requirements.js` correctly defaults to text models | |
462 | 463 | | **2026-03-14** | | 🔗 **AI Variable Controls** — new unified 🔗 Vars button on AI and Agent cards opens combined dropdown with 📤 Output Variable (text input to name the block's result) and 📥 Input Variables (checkbox picker listing declared `@var:` names from other blocks + runtime vars); variable chaining enables multi-block AI pipelines (`@var: research` → `@input: research`); declared variables appear before execution with "declared" badge; Doc Variables Panel (`{•} Vars` toolbar button) now shows ⏳ Pending Vars section for declared-but-unexecuted variables; `@var:` and `@input:` directives stripped from displayed prompt text | |
463 | 464 | | **2026-03-14** | | 🧠 **Think Mode Refinement & Multi-Select Search** — Think mode (`@think: Yes` / 🧠 toggle) now uses two-pass generation: first generates with thinking enabled, then passes the draft back to the model to add important details, examples, and missing information; removed complex ReAct pattern in favor of simple refinement; multi-select search provider dropdown on AI Generate and Agent Flow cards (checkbox pills, activate multiple engines simultaneously); search results fetched in parallel and merged | |
464 | 465 | | **2026-03-14** | | 🔑 **API Key Re-entry & Git UX** — fixed bug where incorrect cloud API keys couldn't be re-entered (dropdown re-click now re-shows key modal); "Change API Key" link in error status bar for auth failures; 🔑 key icon button on cloud model cards in DocGen setup panel with "Key Set"/"Key Required" badges; 🐙 Git toolbar button now shows centered confirmation dialog warning that local models have small context windows and cloud models (Groq, Gemini, OpenRouter) are recommended for repo analysis; Git analysis auto-opens API key modal on key/model-not-ready errors | |
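
The 2026-03-15 entry describes a Run All flow that loads required models before any block runs and lets the Stop button interrupt that loading through the `M._execAborted` flag. A minimal sketch of the idea follows. Only the names `M._execAborted` and `waitForModelReady` come from the changelog; the function bodies, signatures, the `ticks` parameter, and the `{ id, model, run }` block shape are hypothetical stand-ins, not TextAgent's actual implementation.

```javascript
// Hypothetical sketch: only M._execAborted and the name waitForModelReady
// appear in the changelog; everything else here is illustrative.
const M = { _execAborted: false };

// Stand-in for model loading: polls a few times and gives up as soon as
// the shared abort flag is set (for example, by a Stop button handler).
async function waitForModelReady(modelId, ticks = 3) {
  for (let i = 0; i < ticks; i++) {
    if (M._execAborted) throw new Error(`aborted while loading ${modelId}`);
    await new Promise((resolve) => setTimeout(resolve, 10));
  }
  return modelId;
}

// Runs blocks sequentially after a preflight pass that loads each distinct
// model exactly once. Returns a log of what happened, in order.
async function runAll(blocks) {
  M._execAborted = false;
  const log = [];
  const models = [...new Set(blocks.map((b) => b.model).filter(Boolean))];

  try {
    for (const id of models) {
      await waitForModelReady(id);
      log.push(`loaded:${id}`);
    }
  } catch (err) {
    // Abort requested during preflight: no block is executed.
    log.push('aborted');
    return log;
  }

  for (const block of blocks) {
    if (M._execAborted) {
      log.push('aborted');
      break;
    }
    const started = Date.now();
    await block.run();
    log.push(`done:${block.id}`);
    console.log(`[RunAll] ${block.id} finished in ${Date.now() - started} ms`);
  }
  return log;
}
```

In this sketch, calling `runAll(blocks)` and then setting `M._execAborted = true` while the first model is still loading resolves with `['aborted']`, so no block runs. Checking the flag inside the loading loop, rather than only between blocks, is what allows a stop request to take effect before execution starts.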
|