Real-time AI vision understanding using your webcam with SmolVLM and llama.cpp. Everything runs locally - no cloud, no API keys.
# Install llama.cpp from releases, add to PATH
# https://github.com/ggerganov/llama.cpp/releases
# Start server
llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF
# Open index.html in browser, click StartThat's it. Model auto-downloads (~300MB), runs on GPU automatically.
1. Install llama.cpp
Download from releases, extract, add to PATH.
Mac:
cd ~/Downloads
tar -xzf llama-*-bin-macos-*.tar.gz
mkdir -p ~/tools && mv llama-b* ~/tools/llama.cpp
echo 'export PATH="$HOME/tools/llama.cpp:$PATH"' >> ~/.zshrc && source ~/.zshrc
xattr -dr com.apple.quarantine ~/tools/llama.cpp/Windows: Extract to C:\Program Files\llama.cpp\, add to PATH via Environment Variables.
2. Start the server
llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUFFirst run downloads the model (~300MB). Keep terminal open.
3. Open LiveVision
open index.html # or just double-click itGrant camera permissions, click "Start Processing".
- Instruction field:
What do you see?,Count fingers,Read text, etc. - Update interval: 500ms recommended (balance of speed/smoothness)
- JSON output: Just ask for it:
Return JSON with {objects, colors, count}
- SmolVLM-500M (default): Fast, real-time capable, ~300MB
- SmolVLM-2.2B: Better quality, slower, ~1.5GB
llama-server -hf ggml-org/SmolVLM-Instruct-GGUF
Camera not working? Grant permissions, close other apps using camera
Server error? Check llama-server is running on localhost:8080
Slow? Use 500M model, increase interval to 1-2s
Mac quarantine error? Run xattr -dr com.apple.quarantine ~/tools/llama.cpp/
More help: See GUIDE.md for detailed setup, performance tips, and troubleshooting.
Point your webcam at anything, get real-time AI descriptions. Runs 100% locally on your machine. Perfect for:
- Accessibility (describe surroundings)
- Object detection and counting
- OCR (read text from camera)
- Automation (JSON outputs)
- Privacy-focused vision AI
Built with llama.cpp and SmolVLM.
Performance: M1/M2 Mac or RTX GPU = smooth real-time. CPU-only = usable but slower.
License: MIT (SmolVLM is Apache 2.0)
Issues? Check GUIDE.md or open an issue.