TubeMind is a board-based research app for learning from YouTube. Instead of treating each question as a one-off search, it groups related questions into topic-bound boards, pulls transcript evidence from relevant videos, indexes that material with LightRAG, and turns each answer into a reusable note with linked source evidence.
The current app is a server-rendered FastHTML experience with a redesigned premium UI, persistent light/dark theme toggle, Google OAuth or demo auth, durable SQLite state, and Railway-ready deployment support.
- Creates a new board automatically from your first question.
- Keeps follow-up questions inside the same topic region instead of starting from scratch each time.
- Searches YouTube for caption-friendly, embeddable videos when the current board does not already have enough evidence.
- Fetches transcripts, normalizes them, caches them on disk, and indexes them into a per-board LightRAG knowledge base.
- Generates note answers backed by transcript chunks, then stores those notes so the board becomes more useful over time.
- Lets you open note detail pages with evidence excerpts, linked timestamps, and the original search queries that expanded the board.
- A user asks a question in a board.
- TubeMind first queries the existing board corpus.
- If the board does not have enough evidence yet, TubeMind plans or falls back to YouTube search queries.
- It searches YouTube for caption-friendly, embeddable videos, which are more likely to yield transcripts in hosted environments.
- It fetches transcripts using layered fallbacks:
  - TranscriptAPI
  - youtube-transcript-api
  - yt-dlp subtitle download fallback
- It stores cleaned transcript artifacts under the app data directory and indexes them into that board's LightRAG store.
- It answers the question, stores the note, stores the source chunks, and refreshes the board summary over time.
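The steps above amount to a retrieve-first, expand-on-miss loop. The following is a minimal sketch of that control flow, not TubeMind's actual code: the `Board` class, the `retrieve` helper, the `MIN_EVIDENCE_CHUNKS` threshold, and the `fetch_transcripts` callback are all hypothetical, and a naive keyword match stands in for the LightRAG query.

```python
from dataclasses import dataclass, field
from typing import Callable

MIN_EVIDENCE_CHUNKS = 3  # assumed threshold, not taken from TubeMind


@dataclass
class Board:
    """Hypothetical in-memory stand-in for a board and its indexed corpus."""
    topic: str
    chunks: list[str] = field(default_factory=list)
    notes: list[dict] = field(default_factory=list)


def retrieve(board: Board, question: str) -> list[str]:
    """Naive keyword match standing in for a real retrieval query."""
    words = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
    return [c for c in board.chunks if any(w in c.lower() for w in words)]


def answer_question(
    board: Board,
    question: str,
    fetch_transcripts: Callable[[str], list[str]],
) -> dict:
    """Query the board first; expand it from new transcripts only on a miss."""
    evidence = retrieve(board, question)
    expanded = False
    if len(evidence) < MIN_EVIDENCE_CHUNKS:
        # Not enough evidence yet: add new transcript chunks, then retry.
        board.chunks.extend(fetch_transcripts(question))
        evidence = retrieve(board, question)
        expanded = True
    # Store the note so later questions can build on it.
    note = {"question": question, "evidence": evidence, "expanded": expanded}
    board.notes.append(note)
    return note
```

In this sketch a follow-up question that the board can already answer never triggers the fetch callback, which is the behavior the board model is meant to encourage.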
Each board has its own transcript cache and LightRAG working directory, so follow-up notes stay grounded in the same topic instead of polluting a single global corpus.
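One way to picture the per-board isolation is a helper that derives both directories from the data directory and a board identifier. This is a sketch under assumptions: the `boards/<board_id>/transcripts` and `boards/<board_id>/lightrag` layout and the `board_paths` name are illustrative, not the paths TubeMind actually uses.

```python
from pathlib import Path


def board_paths(data_dir: Path, board_id: str) -> dict[str, Path]:
    """Return (and create) the transcript cache and index directories for one board."""
    root = data_dir / "boards" / board_id  # hypothetical layout
    paths = {
        "transcripts": root / "transcripts",
        "lightrag": root / "lightrag",
    }
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Because every path is rooted at the board's own directory, two boards never share a working directory, and deleting one board's directory leaves the others untouched.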
- FastHTML for the server-rendered app and route layer
- HTMX for incremental UI interactions
- LightRAG for retrieval and graph-backed indexing
- OpenAI API for planning, synthesis, and answer generation
- YouTube Data API v3 for video search and metadata
- TranscriptAPI plus transcript fallbacks for transcript acquisition
- SQLite for durable user, board, note, and evidence metadata
- uv for dependency management and running the app
The main files are:
- tubemind/routes.py: app factory, routes, theme bootstrap, auth guards, health endpoint
- tubemind/ui.py: server-rendered UI builders for login, workspace, note detail, topbar, and theme toggle
- tubemind/services.py: board runtime orchestration, YouTube search, transcript fetching, indexing, retrieval, answer generation
- tubemind/auth.py: Google OAuth helpers, demo auth, SQLite tables, board persistence
- tubemind/config.py: environment loading, app constants, path configuration
- static/tubemind.css: full visual system for light and dark themes
- tubemind/__main__.py: python -m tubemind entrypoint
- Python 3.12+
- uv
- OpenAI API key
- YouTube Data API key
- TranscriptAPI key
- Google OAuth credentials if you want Google login locally
Create a .env file in the repo root. At minimum, local development needs:
OPENAI_API_KEY=your_openai_key
OPENAI_MODEL=gpt-4.1-nano
YOUTUBE_API_KEY=your_youtube_api_key
TRANSCRIPTAPI_API_KEY=your_transcriptapi_key
BASE_URL=http://localhost:5001
SESSION_SECRET=any-long-random-string

If you want Google OAuth locally, also set:
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret

Optional variables:
DEMO_AUTH_ENABLED=false
DEMO_USER_ID=demo-user
DEMO_USER_NAME=Coursework Demo
DEMO_USER_EMAIL=demo@tubemind.local
DEMO_USER_PICTURE=
TUBEMIND_DATA_DIR=.local
YOUTUBE_TRANSCRIPT_COOKIES_FILE=
YOUTUBE_COOKIES_BROWSER=
PORT=5001

Install dependencies and start the app:

cd TubeMind
UV_CACHE_DIR=.local/uv-cache uv sync
UV_CACHE_DIR=.local/uv-cache uv run python -m tubemind

Open:
http://127.0.0.1:5001
Stop the server with Ctrl+C.
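If the server fails on startup, a missing variable is a common cause. The sketch below lists which of the minimum variables named above are unset or blank. The `missing_settings` helper is hypothetical, and whether TubeMind itself supplies defaults for any of these (for example OPENAI_MODEL) is not stated here, so treat the list as the README's minimum rather than the app's own validation.

```python
import os
from typing import Mapping, Optional

# The minimum local-development variables listed in this README.
REQUIRED = (
    "OPENAI_API_KEY",
    "OPENAI_MODEL",
    "YOUTUBE_API_KEY",
    "TRANSCRIPTAPI_API_KEY",
    "BASE_URL",
    "SESSION_SECRET",
)


def missing_settings(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    print("missing:", ", ".join(missing) if missing else "none")
```

Note that this reads the process environment only; it does not parse the .env file, so run it in a shell where those values have been exported.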
TubeMind supports two sign-in modes:
- Google OAuth:
  - best for normal usage
  - requires GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, and a matching BASE_URL
- Demo auth:
  - best for coursework demos or simpler hosted deployments
  - enable with DEMO_AUTH_ENABLED=true
  - creates a synthetic local user session without Google sign-in
If both are configured, the login page shows both options.
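The selection rule can be sketched as a small function over the environment. This is an illustration under assumptions: the `enabled_login_modes` name is hypothetical, and the exact way TubeMind parses DEMO_AUTH_ENABLED (here, a case-insensitive comparison with "true") is a guess.

```python
from typing import Mapping


def enabled_login_modes(env: Mapping[str, str]) -> list[str]:
    """Return the sign-in options a login page would offer for this environment."""
    modes = []
    # Google OAuth needs both credentials plus a BASE_URL for the redirect.
    if all(env.get(k) for k in ("GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET", "BASE_URL")):
        modes.append("google")
    # Assumed parsing: only the literal value "true" (any case) enables demo auth.
    if env.get("DEMO_AUTH_ENABLED", "false").strip().lower() == "true":
        modes.append("demo")
    return modes
```

An empty result means neither mode is configured, which is worth surfacing as a startup error rather than an empty login page.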
TubeMind stores two kinds of state:
- SQLite app state:
  - users
  - boards
  - notes
  - board search queries
  - note evidence chunks
  - indexed video metadata
- Board filesystem state:
  - transcript artifacts
  - per-board LightRAG working directories
By default this lives under TUBEMIND_DATA_DIR. In production, that directory should be mounted to persistent storage.
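To make the persistence point concrete, here is a minimal sketch of SQLite state living under the data directory. Everything specific in it is an assumption: the `tubemind.db` file name, the two-table schema, and the `open_state` helper are illustrative and cover only boards and notes, not the app's real tables.

```python
import sqlite3
from pathlib import Path

# Hypothetical, simplified schema: the real app stores more kinds of state.
SCHEMA = """
CREATE TABLE IF NOT EXISTS boards (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    title TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY,
    board_id INTEGER NOT NULL REFERENCES boards(id),
    question TEXT NOT NULL,
    answer TEXT NOT NULL
);
"""


def open_state(data_dir: Path) -> sqlite3.Connection:
    """Open (or create) the database file inside the data directory."""
    data_dir.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(data_dir / "tubemind.db")  # hypothetical file name
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Because the file sits inside the data directory, rows written before a restart are still there afterwards only if that directory is on persistent storage.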
Railway is the recommended hosted path for this repo.
- Push the repo to GitHub.
- Create a Railway service from the repo.
- Add a volume to the same service.
- Mount the volume at:
/data/tubemind
- Set:
TUBEMIND_DATA_DIR=/data/tubemind

That path is correct for the current production setup.

Required variables:
OPENAI_API_KEY=...
OPENAI_MODEL=gpt-4.1-nano
YOUTUBE_API_KEY=...
TRANSCRIPTAPI_API_KEY=...
BASE_URL=https://your-service.up.railway.app
SESSION_SECRET=choose-a-long-random-string
TUBEMIND_DATA_DIR=/data/tubemind

For Google OAuth:
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...
DEMO_AUTH_ENABLED=false

For coursework/demo mode:
DEMO_AUTH_ENABLED=true

If yt-dlp needs cookies to get around YouTube bot checks on hosted infrastructure, add:
YOUTUBE_TRANSCRIPT_COOKIES_FILE=/data/tubemind/youtube-cookies.txt

Do not commit youtube-cookies.txt to GitHub. Upload it only to the mounted Railway volume.
The app exposes:
/health
It returns:
{"ok": true}

- The app reads Railway's injected PORT automatically.
- The Docker image runs python -m tubemind.
- The stylesheet URL is cache-busted so CSS changes deploy more reliably.
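The /health endpoint described above can be probed after a deploy with a few lines of standard-library Python. This is a sketch, not part of the repo: the `is_healthy` and `check_health` helpers are hypothetical, and the only thing taken from the app is the /health path and the {"ok": true} body.

```python
import json
from urllib.request import urlopen


def is_healthy(body: bytes) -> bool:
    """True only if the body is a JSON object whose "ok" field is true."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Request <base_url>/health and report whether the app looks healthy."""
    try:
        with urlopen(f"{base_url.rstrip('/')}/health", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:  # connection errors, timeouts, and HTTP error statuses
        return False
```

For a local run, `check_health("http://127.0.0.1:5001")` should return True once the server is up.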
Transcript fetching is intentionally layered because hosted deployments are less forgiving than local machines.
Primary path:
- TranscriptAPI, using the YouTube-specific transcript endpoint
Fallbacks:
- youtube-transcript-api
- yt-dlp subtitle download
TubeMind also prefers caption-friendly and embeddable YouTube search results to improve transcript success on Railway.
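The layering can be sketched as an ordered list of providers tried in turn until one returns usable text. This is a generic illustration of the pattern, not TubeMind's implementation: `fetch_with_fallbacks` and the `Fetcher` type are hypothetical, and the real provider calls are deliberately left out, since each fetcher here is just any callable that takes a video ID.

```python
from typing import Callable, Optional, Sequence

# A fetcher takes a video ID and returns transcript text, or None/"" on a miss.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallbacks(
    video_id: str,
    fetchers: Sequence[tuple[str, Fetcher]],
) -> tuple[str, str]:
    """Try each (name, fetcher) in order; return the first non-empty transcript."""
    errors = []
    for name, fetch in fetchers:
        try:
            text = fetch(video_id)
        except Exception as exc:  # any provider failure moves on to the next one
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty transcript")
    raise LookupError(f"no transcript for {video_id}: " + "; ".join(errors))
```

Returning the provider name alongside the text makes it easy to log which layer succeeded, which helps when diagnosing why a hosted deployment behaves differently from a local one.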
- Never commit youtube-cookies.txt.
- Rotate any API keys or OAuth secrets that were accidentally exposed.