Notion Knowledge Assistant

AI-powered Telegram bot that turns a Notion workspace into a queryable knowledge base. A self-hosted alternative to NotebookLM, built before NotebookLM existed.

Why this exists

Built in early 2024 for a crisis psychologist with a Notion workspace full of psychology texts, PDFs, and case studies. She needed answers, not search results — "what does Linehan say about distress tolerance?" rather than Ctrl+F.

NotebookLM didn't exist yet. ChatGPT couldn't read her Notion. The bot bridged the gap: fuzzy search over the Notion database, then GPT-4 reads the matched pages and writes a structured answer with citations back to the original Notion URLs.

It is now in daily production use as @ai_zhanna_assistant.

Production deployment

Live as @ai_zhanna_assistant: a psychology and mental-health knowledge base built on Zhanna Travkina's (Жанна Травкина) Notion workspace. The UI is in Russian; the bot is bilingual (RU/EN).

[Screenshots: question, answer, cited context]

How it works

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Telegram   │────▶│ Search Engine│────▶│   Notion    │
│    User     │     │  (RapidFuzz) │     │   Database  │
└─────────────┘     └──────────────┘     └─────────────┘
       │                   │
       │                   ▼
       │            ┌──────────────┐
       │            │  AI Assistant│
       │            │   (OpenAI)   │
       │            └──────────────┘
       ◀───────────────────┘
        Structured answer + source links
  1. User asks a natural-language question in Telegram (RU or EN)
  2. Search engine runs fuzzy matching across the Notion database (text pages, PDFs, DOCX)
  3. Query optimizer reformulates queries that don't match well
  4. AI assistant reads matched content and writes a structured answer
  5. Answer is sent back with links to the original Notion pages
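
The steps above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the project's code: `Page`, `fuzzy_search`, `simplify_query`, `build_prompt`, and `prepare` are hypothetical names, matching is done on page titles only, and the standard library's `difflib` stands in for RapidFuzz so the snippet runs without dependencies. The OpenAI call is left out; the sketch stops at the prompt that would be sent.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical stop-word list used by the query reformulation step.
STOPWORDS = {"what", "does", "do", "say", "about", "the", "a", "an", "is", "how", "of"}


@dataclass
class Page:
    """Hypothetical in-memory view of one Notion page."""
    title: str
    url: str
    text: str


def score(query: str, candidate: str) -> float:
    """Similarity on a 0-100 scale (difflib stand-in for a RapidFuzz scorer)."""
    return 100 * SequenceMatcher(None, query.lower(), candidate.lower()).ratio()


def fuzzy_search(query: str, pages: list[Page], threshold: float = 70) -> list[Page]:
    """Step 2: keep pages whose title scores at or above the threshold, best first."""
    scored = [(score(query, page.title), page) for page in pages]
    hits = [(s, page) for s, page in scored if s >= threshold]
    return [page for _, page in sorted(hits, key=lambda item: item[0], reverse=True)]


def simplify_query(query: str) -> str:
    """Step 3: a crude reformulation that drops punctuation and stop words."""
    words = [word.strip("?.,!").lower() for word in query.split()]
    return " ".join(word for word in words if word and word not in STOPWORDS)


def build_prompt(question: str, pages: list[Page]) -> str:
    """Step 4: number the sources so the model can cite them."""
    sources = "\n\n".join(
        f"[{i}] {page.title} ({page.url})\n{page.text}"
        for i, page in enumerate(pages, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"{sources}\n\nQuestion: {question}"
    )


def prepare(question: str, pages: list[Page], threshold: float = 70) -> tuple[str, list[str]]:
    """Steps 2-5: search, retry with a simplified query, return prompt and source links."""
    hits = fuzzy_search(question, pages, threshold)
    if not hits:
        hits = fuzzy_search(simplify_query(question), pages, threshold)
    return build_prompt(question, hits), [page.url for page in hits]
```

In this sketch the full question scores below the threshold against a short title, so the retry with the simplified query is what produces the match. A real deployment would also match against page bodies and extracted PDF or DOCX text.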

Tech stack

Layer      Tool
Language   Python 3.11+ (full async)
Telegram   python-telegram-bot v21
LLM        OpenAI SDK v1.x (default gpt-4o-mini)
Notion     Notion API via httpx
Search     RapidFuzz (fuzzy matching)
Config     pydantic-settings
Logging    structlog
Container  Docker + docker-compose
Tests      pytest

Quick start

git clone https://github.com/CreatmanCEO/notion-knowledge-assistant.git
cd notion-knowledge-assistant
python -m venv .venv
source .venv/bin/activate    # Windows: .venv\Scripts\activate
pip install -e .
cp .env.example .env         # set tokens
python -m src.main

Docker

docker-compose up -d
docker-compose logs -f

Configuration

Variable               Description                              Required
TELEGRAM_BOT_TOKEN     Telegram Bot API token                   Yes
OPENAI_API_KEY         OpenAI API key                           Yes
NOTION_API_KEY         Notion integration token                 Yes
NOTION_DATABASE_ID     Target Notion database ID                Yes
OPENAI_MODEL           Model name (default gpt-4o-mini)         No
LOG_LEVEL              Logging level (default INFO)             No
CACHE_TTL              Cache lifetime in seconds (default 300)  No
FUZZY_MATCH_THRESHOLD  Min match score 0–100 (default 70)       No
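
As a rough illustration of how these variables and defaults fit together, the sketch below loads them from the environment using only the standard library. The project itself uses pydantic-settings; the `Settings` dataclass and `load_settings` function here are hypothetical stand-ins, not the repository's actual classes.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# The four variables marked "Yes" in the table above.
REQUIRED = (
    "TELEGRAM_BOT_TOKEN",
    "OPENAI_API_KEY",
    "NOTION_API_KEY",
    "NOTION_DATABASE_ID",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object mirroring the configuration table."""
    telegram_bot_token: str
    openai_api_key: str
    notion_api_key: str
    notion_database_id: str
    openai_model: str = "gpt-4o-mini"
    log_level: str = "INFO"
    cache_ttl: int = 300
    fuzzy_match_threshold: int = 70


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, failing fast on missing or invalid values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")

    threshold = int(env.get("FUZZY_MATCH_THRESHOLD", 70))
    if not 0 <= threshold <= 100:
        raise ValueError("FUZZY_MATCH_THRESHOLD must be between 0 and 100")

    return Settings(
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        openai_api_key=env["OPENAI_API_KEY"],
        notion_api_key=env["NOTION_API_KEY"],
        notion_database_id=env["NOTION_DATABASE_ID"],
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        cache_ttl=int(env.get("CACHE_TTL", 300)),
        fuzzy_match_threshold=threshold,
    )
```

Passing the environment as an argument keeps the loader easy to test: a plain dict can replace `os.environ`.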

Bot commands

Command          Description
/start           Welcome message
/help            Detailed help
/search <query>  Direct search without AI processing
/refresh         Reload data from Notion
/stats           Database statistics
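
The README does not describe how caching is implemented, so the following is only one plausible way `CACHE_TTL` and `/refresh` could interact: cached Notion data is reloaded automatically once it is older than the TTL, and `/refresh` forces a reload immediately. `TTLCache`, its `loader` callback, and the injectable `clock` are all assumptions made for this sketch.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Hypothetical single-value cache with a time-to-live and a manual refresh."""

    def __init__(
        self,
        loader: Callable[[], T],
        ttl: float = 300,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # e.g. a function that fetches pages from Notion
        self._ttl = ttl            # seconds, as in CACHE_TTL
        self._clock = clock        # injectable so tests can control time
        self._value: Optional[T] = None
        self._loaded_at: Optional[float] = None

    def get(self) -> T:
        """Return the cached value, reloading it if it is missing or expired."""
        expired = (
            self._loaded_at is None
            or self._clock() - self._loaded_at >= self._ttl
        )
        if expired:
            return self.refresh()
        return self._value  # type: ignore[return-value]

    def refresh(self) -> T:
        """Reload unconditionally; this is what a /refresh handler could call."""
        self._value = self._loader()
        self._loaded_at = self._clock()
        return self._value
```

A monotonic clock is used by default so that system clock adjustments do not expire or extend the cache unexpectedly.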

Notion setup

  1. Go to Notion Integrations
  2. Create a new integration, copy the Internal Integration Token
  3. Open the target database, open the ... menu, choose Add connections, then select your integration
  4. Copy the Database ID from the URL: https://notion.so/workspace/DATABASE_ID?v=...
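
Copying the ID by hand is error-prone, so a small helper can pull it out of the URL instead. The sketch below assumes the ID is the 32-character hexadecimal string at the end of the last path segment, optionally written with dashes or preceded by a page title; `extract_database_id` is a hypothetical helper, not part of the project.

```python
import re
from urllib.parse import urlparse

# A Notion ID is a UUID; with dashes removed it is 32 hexadecimal characters.
_ID_RE = re.compile(r"([0-9a-f]{32})$", re.IGNORECASE)


def extract_database_id(url: str) -> str:
    """Return the 32-character database ID found at the end of the URL path."""
    path = urlparse(url).path.rstrip("/")
    # Drop dashes so both "Title-<id>" and dashed UUID forms reduce to plain hex.
    last_segment = path.rsplit("/", 1)[-1].replace("-", "")
    match = _ID_RE.search(last_segment)
    if match is None:
        raise ValueError(f"No 32-character database ID found in {url!r}")
    return match.group(1).lower()
```

The query string (`?v=...`) identifies a database view and is ignored here, since only the path carries the database ID.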

Use cases

  • Coaches, therapists, researchers — fast retrieval from a curated knowledge base
  • Knowledge workers — query research notes and documentation
  • Students — query textbooks and study materials
  • Writers — search through reference notes

Development

pip install -e ".[dev]"
pytest
ruff check .
mypy src

Limitations

  • Single-database scope per deployment; multi-workspace requires separate instances
  • OpenAI dependency — costs scale with traffic; quality degrades with cheaper models
  • Fuzzy search has no semantic embeddings (intentional — keeps cost predictable)
  • Notion API rate limits apply; /refresh is the manual recovery path
  • No user-level access control — everyone with the bot link sees the same data
  • Russian + English tested in production; other languages untested

Related — Claude Code ecosystem by the same author

Author

Nick Podolyak

License

MIT — see LICENSE.


Originally built for a friend. Open-sourced for anyone who wants their own AI-powered knowledge base.
