Launchstack - Professional Document Reader AI

Launchstack is a Next.js platform for role-based document management, AI-assisted Q&A, and predictive document analysis. It combines document upload, optional OCR, embeddings, and retrieval to help teams spot missing or incomplete documentation and act on it faster.

Core Features

Clerk-based Employer/Employee authentication with role-aware middleware.
Document upload pipeline with optional OCR for scanned PDFs.
PostgreSQL + pgvector semantic retrieval for RAG workflows.
AI chat and predictive document analysis over uploaded content.
Agent guardrails with PII filtering, grounding checks, and confidence gating.
Supervisor agent that validates outputs against domain-specific rubrics.
Marketing pipeline with content generation for Reddit, X, LinkedIn, and Bluesky.
Optional web-enriched analysis with Tavily.
Optional reliability/observability via Inngest and LangSmith.
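
As a rough illustration of what PII filtering and confidence gating can look like, here is a minimal sketch. The names (AgentAnswer, redactEmails, gateAnswer), the email-only pattern, and the 0.7 threshold are all assumptions made for this example; they are not taken from the Launchstack codebase.

```typescript
// Hypothetical guardrail helpers; names and the threshold are illustrative only.
interface AgentAnswer {
  text: string;
  confidence: number; // assumed to be a score in [0, 1]
}

// Deliberately simple email pattern; real PII filtering covers far more cases.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace email addresses with a placeholder.
function redactEmails(text: string): string {
  return text.replace(EMAIL_PATTERN, "[REDACTED_EMAIL]");
}

// Return the redacted answer when confidence meets the threshold, otherwise null.
function gateAnswer(answer: AgentAnswer, minConfidence = 0.7): string | null {
  if (answer.confidence < minConfidence) return null;
  return redactEmails(answer.text);
}
```

A caller would treat a null result as "do not show this answer" and fall back to a refusal or a request for clarification.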

Predictive Analysis — Supported Document Types

Launchstack runs domain-specific analysis tailored to your document type:

Type	What It Detects
Contract	Missing exhibits, schedules, addendums, and supporting agreements
Financial	Missing balance sheets, audit reports, income statements
Technical	Missing specifications, manuals, diagrams, deliverables
Compliance	Missing regulatory filings, certifications, policy documents
Educational	Missing syllabi, handouts, readings, linked resources
HR	Missing policies, forms, benefits materials, handbooks
Research	Missing cited papers, datasets, supplementary materials
General	Missing cross-referenced documents and attachments

Each analysis type also extracts insights (deadlines, action items, resources, caveats) and runs chain-of-verification on high-priority predictions.
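
The table above can be read as a mapping from document type to the companion materials the analysis looks for. The sketch below encodes that mapping as plain data and compares it against a list of items already present. It is only a lookup for illustration: the actual analysis is AI-driven, and the names DocumentType, EXPECTED_COMPANIONS, and missingCompanions are assumptions.

```typescript
// Hypothetical data model mirroring the table of supported document types.
type DocumentType =
  | "contract" | "financial" | "technical" | "compliance"
  | "educational" | "hr" | "research" | "general";

const EXPECTED_COMPANIONS: Record<DocumentType, string[]> = {
  contract: ["exhibits", "schedules", "addendums", "supporting agreements"],
  financial: ["balance sheets", "audit reports", "income statements"],
  technical: ["specifications", "manuals", "diagrams", "deliverables"],
  compliance: ["regulatory filings", "certifications", "policy documents"],
  educational: ["syllabi", "handouts", "readings", "linked resources"],
  hr: ["policies", "forms", "benefits materials", "handbooks"],
  research: ["cited papers", "datasets", "supplementary materials"],
  general: ["cross-referenced documents", "attachments"],
};

// List the expected companions that are not in `present` (case-insensitive).
function missingCompanions(type: DocumentType, present: string[]): string[] {
  const have = new Set(present.map((p) => p.toLowerCase()));
  return EXPECTED_COMPANIONS[type].filter((item) => !have.has(item));
}
```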

Importing External Knowledge

Launchstack can ingest content exported from third-party tools. No API keys or OAuth setup are required: export your data, upload the files, and the ingestion pipeline handles the rest.

Supported Export Formats

Source	Export Method	Resulting Format	Launchstack Adapter
Notion	Settings > Export > Markdown & CSV	`.md`, `.csv` (ZIP)	TextAdapter, SpreadsheetAdapter
Notion	Page > Export > HTML	`.html`	HtmlAdapter
Google Docs	File > Download > Microsoft Word	`.docx`	DocxAdapter
Google Sheets	File > Download > CSV or Excel	`.csv`, `.xlsx`	SpreadsheetAdapter
Google Drive	Google Takeout (takeout.google.com)	`.docx` (ZIP)	DocxAdapter
Slack	Workspace Settings > Import/Export > Export	`.json` (ZIP)	JsonExportAdapter
GitHub	Code > Download ZIP	`.md`, `.txt` (ZIP)	TextAdapter
GitHub	`gh issue list --json ...`	`.json`	JsonExportAdapter
GitHub	`gh pr list --json ...`	`.json`	JsonExportAdapter
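
To make the format-to-adapter mapping concrete, the following sketch picks an adapter name from a file extension. The adapter names come from the table above; the function pickAdapter and the lookup-by-extension approach are assumptions for illustration, and ZIP archives would need to be unpacked before this step.

```typescript
// Adapter names are taken from the table; the selection logic is hypothetical.
type AdapterName =
  | "TextAdapter" | "SpreadsheetAdapter" | "HtmlAdapter"
  | "DocxAdapter" | "JsonExportAdapter";

const ADAPTER_BY_EXTENSION: Record<string, AdapterName> = {
  ".md": "TextAdapter",
  ".txt": "TextAdapter",
  ".csv": "SpreadsheetAdapter",
  ".xlsx": "SpreadsheetAdapter",
  ".html": "HtmlAdapter",
  ".docx": "DocxAdapter",
  ".json": "JsonExportAdapter",
};

// Return the adapter for a file name, or undefined for unsupported extensions.
function pickAdapter(fileName: string): AdapterName | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return undefined;
  return ADAPTER_BY_EXTENSION[fileName.slice(dot).toLowerCase()];
}
```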

How to Export

Notion

Open your Notion workspace.
Click the ... menu on a page, or go to Settings & members > Export for a full workspace export.
Select Markdown & CSV as the format and check Include subpages if needed.
Download the ZIP and upload it directly to Launchstack.

Google Docs / Sheets

Open the document in Google Docs or Sheets.
Go to File > Download and choose Microsoft Word (.docx) for Docs, or CSV (.csv) / Excel (.xlsx) for Sheets.
Upload the downloaded file. For bulk exports, use Google Takeout to export your Drive as a ZIP.

Slack

Go to Workspace Settings > Import/Export Data > Export.
Choose a date range and start the export.
Download the ZIP and upload it to Launchstack. Each channel's messages will be ingested as a separate document.

GitHub

Repo docs: Click Code > Download ZIP on any GitHub repository. Upload the ZIP — all Markdown and text files will be ingested.

Issues: Install the GitHub CLI and run:

gh issue list --state all --limit 1000 --json number,title,body,state,labels,author,createdAt,closedAt,comments > issues.json

Upload the resulting issues.json file.

Pull requests: Run:

gh pr list --state all --limit 1000 --json number,title,body,state,labels,author,createdAt,mergedAt,comments > prs.json

Upload the resulting prs.json file.
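
For orientation, the sketch below shows one way an entry from issues.json could be flattened into plain text before chunking. The GhIssue interface lists only the fields requested in the command above, with shapes as the GitHub CLI is understood to emit them (check them against your own export). The function issueToText and the output layout are assumptions, not Launchstack's actual JsonExportAdapter behavior.

```typescript
// Subset of the fields requested via --json; verify shapes against your export.
interface GhIssue {
  number: number;
  title: string;
  body: string;
  state: string;
  labels: { name: string }[];
  author: { login: string };
  createdAt: string;
  closedAt: string | null;
  comments: { author: { login: string }; body: string }[];
}

// Hypothetical flattening of one issue into a text document.
function issueToText(issue: GhIssue): string {
  const lines = [
    `#${issue.number} ${issue.title} [${issue.state}]`,
    `Author: ${issue.author.login}`,
    `Labels: ${issue.labels.map((l) => l.name).join(", ") || "none"}`,
    "",
    issue.body,
  ];
  for (const c of issue.comments) {
    lines.push("", `Comment by ${c.author.login}:`, c.body);
  }
  return lines.join("\n");
}
```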

All uploaded content flows through the standard ingestion pipeline (chunking, embedding, RAG indexing) and becomes searchable alongside your other documents.
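
The chunking step can be pictured with a minimal fixed-size chunker with overlap, sketched below. This is a generic technique shown under assumed parameters (character-based sizes, defaults of 1000 and 200); the chunking strategy Launchstack actually uses may differ.

```typescript
// Split text into fixed-size character windows that overlap by `overlap` characters.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("size must be positive and overlap must be in [0, size)");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    // Stop once the current window reaches the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of embedding some text twice.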

Architecture

Launchstack follows a three-layer modular architecture:

block-beta
  columns 9

  SLABEL["Services\nLayer"]:1
  MKT["Marketing Engine\n─────────────\nTrend Analysis\nContent Generation\nWeb Scraping Jobs"]:2
  LEG["Legal Services\n─────────────\nTemplate Library\nAuto-Fill & Clauses\nLegal Vault"]:2
  ONB["Employee Onboarding\n─────────────\nOnboarding Agent\nQuizzes & Checks\nProgress Tracking"]:2
  DOCR["Document Reasoning\n─────────────\nPage Index & TOC\nRLM Agent\nKnowledge Graph"]:2

  space:9

  TLABEL["Tools\nLayer"]:1
  RAG["RAG Pipeline\n(BM25 + Vector)"]:2
  WEB["Web Search\n(Tavily, Firecrawl)"]:2
  REW["Doc Rewrite\n(Summarize, Refine)"]:2
  TMPL["Template Engine\n(Form → PDF)"]:2
  space:1
  ING["Doc Ingestion\n(OCR, Chunk, Embed)"]:4
  ENT["Entity Extraction\n(NER, Graph RAG)"]:4

  space:9

  PLABEL["Physical\nLayer"]:1
  DB["PostgreSQL + pgvector\n─────────────\nEmbeddings Index\nDocument Structure\nKnowledge Graph\nDomain Tables"]:2
  HOST["Hosting & Compute\n─────────────\nNext.js 15\nInngest Jobs\nAgent Hosting\nML Sidecar"]:2
  EXT["External Services\n─────────────\nOCR Providers\nFile Storage (S3)\nClerk Auth + RBAC"]:2
  KBS["Knowledge Bases\n─────────────\nCompany KB\nLegal Templates\nOnboarding Docs"]:2

  %% Service → Tool edges
  MKT --> RAG
  MKT --> WEB
  MKT --> REW
  LEG --> RAG
  LEG --> REW
  LEG --> TMPL
  ONB --> RAG
  ONB --> REW
  DOCR --> RAG
  DOCR --> WEB
  DOCR --> REW
  DOCR --> ING
  DOCR --> ENT

  %% Tool → Physical edges
  RAG --> DB
  RAG --> KBS
  WEB --> HOST
  REW --> HOST
  TMPL --> EXT
  TMPL --> KBS
  ING --> DB
  ING --> EXT
  ING --> HOST
  ENT --> DB
  ENT --> HOST

  classDef layer fill:#1a1a2e,color:#eee,stroke:none
  classDef svc fill:#4A90D9,color:#fff,stroke:#2C5F8A,stroke-width:1px
  classDef tool fill:#F5A623,color:#fff,stroke:#C47D0E,stroke-width:1px
  classDef phys fill:#27AE60,color:#fff,stroke:#1E8449,stroke-width:1px

  class SLABEL,TLABEL,PLABEL layer
  class MKT,LEG,ONB,DOCR svc
  class RAG,WEB,REW,TMPL,ING,ENT tool
  class DB,HOST,EXT,KBS phys

The platform is organized into:

Services Layer - Vertical business modules (Marketing, Legal, Onboarding, Document Reasoning)
Tools Layer - Reusable AI capabilities (RAG, Web Search, Document Processing, Entity Extraction)
Physical Layer - Infrastructure (PostgreSQL + pgvector, Next.js hosting, External services, Knowledge bases)

All services operate within domain-partitioned boundaries enforced by Clerk RBAC. RAG queries are scoped by domain + company_id to ensure data isolation.
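
A scoped retrieval query of the kind described above might look like the following sketch. It builds a parameterized SQL statement that filters by company_id and domain before ordering by pgvector's cosine-distance operator (<=>). The table name document_chunks, the column names, and the function buildScopedSearch are assumptions; the project itself uses Drizzle ORM, so the real query is likely expressed differently.

```typescript
interface ScopedQuery {
  text: string;
  values: unknown[];
}

// Hypothetical helper: every search is constrained to one company and one domain.
function buildScopedSearch(
  companyId: string,
  domain: string,
  queryEmbedding: number[],
  limit = 5,
): ScopedQuery {
  // pgvector accepts a bracketed, comma-separated literal cast to vector.
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;
  return {
    text:
      "SELECT id, content FROM document_chunks " +
      "WHERE company_id = $1 AND domain = $2 " +
      "ORDER BY embedding <=> $3::vector LIMIT $4",
    values: [companyId, domain, vectorLiteral, limit],
  };
}
```

Putting the scope in the WHERE clause, rather than filtering results afterwards, means rows from other companies or domains never enter the ranking.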

Tech Stack

Next.js 15 + TypeScript
PostgreSQL + Drizzle ORM + pgvector
Clerk authentication
OpenAI + LangChain
UploadThing + optional OCR providers
Tailwind CSS

Prerequisites

Node.js 18.18+ (the minimum supported by Next.js 15)
pnpm
Docker + Docker Compose (recommended for local DB/full stack)
Git

Quick Start

1) Clone and install

git clone <repository-url>
cd pdr_ai_v2-2
pnpm install

2) Configure environment

Create .env from .env.example and fill required values:

DATABASE_URL
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY
CLERK_SECRET_KEY
BLOB_READ_WRITE_TOKEN (Vercel Blob read/write token)
OPENAI_API_KEY
INNGEST_EVENT_KEY (a placeholder value is acceptable)

Optional integrations:

NODE_ENV=development (for development, otherwise assumed to be production)
UPLOADTHING_TOKEN
TAVILY_API_KEY
INNGEST_EVENT_KEY, INNGEST_SIGNING_KEY
AZURE_DOC_INTELLIGENCE_ENDPOINT, AZURE_DOC_INTELLIGENCE_KEY
LANDING_AI_API_KEY, DATALAB_API_KEY
LANGCHAIN_TRACING_V2, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT
DEBUG_PERF (1 or true) to enable dev perf logs for middleware and key auth/dashboard APIs
SIDECAR_URL
NEO4J_URI
NEO4J_USERNAME
NEO4J_PASSWORD
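
A small startup check can catch a missing required variable early. The sketch below takes the required names from the list above; the helper missingEnv is an assumption for illustration and is not part of the repository.

```typescript
// Required variable names as listed in this README.
const REQUIRED_ENV = [
  "DATABASE_URL",
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "BLOB_READ_WRITE_TOKEN",
  "OPENAI_API_KEY",
  "INNGEST_EVENT_KEY",
] as const;

// Return the names of required variables that are unset or blank.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]?.trim());
}
```

In a Node.js process this could be called as missingEnv(process.env) before the server starts, failing fast if the result is non-empty.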

2.1) Configure Vercel Blob Storage

Vercel Blob is used for storing uploaded documents. Both public and private stores are supported: the upload logic detects which mode the store uses and adapts accordingly.

In the Vercel dashboard, go to Storage → Blob → Create Store.
Choose either Public or Private access. Both work:
- Public stores produce URLs the browser can load directly (faster for previews).
- Private stores keep files behind authentication; the app proxies content through /api/documents/[id]/content and /api/files/[id] so previews still work.
Generate a Read/Write token for the store and add it as BLOB_READ_WRITE_TOKEN in your environment (.env locally, or Vercel Project Settings for deploys).
Redeploy so the token is available at build and runtime.
Verify: sign in to the Employer Upload page, upload a small PDF, and confirm /api/upload-local returns a vercel-storage.com URL without errors.
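
The difference between the two access modes can be summarized in a few lines. In this sketch, a public document is previewed from its blob URL directly, while a private one goes through the /api/documents/[id]/content proxy route mentioned above. The StoredDocument shape and the previewUrl helper are assumptions; the app's real detection logic is not shown here.

```typescript
type BlobAccess = "public" | "private";

// Hypothetical record describing where an uploaded document lives.
interface StoredDocument {
  id: string;
  blobUrl: string;
  access: BlobAccess;
}

// Choose the URL the browser should load for a preview.
function previewUrl(doc: StoredDocument): string {
  return doc.access === "public"
    ? doc.blobUrl
    : `/api/documents/${encodeURIComponent(doc.id)}/content`;
}
```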

3) Start database and apply schema

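If the database is not already running, start it first (Method 3 under Docker Deployment Methods starts only the db container), then apply the schema:

pnpm db:push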

4) Run app

pnpm run dev

Open http://localhost:3000.

Docker Deployment Methods

Method 1: Full stack (recommended)

Runs db + migrate + app via Compose:

docker compose --env-file .env --profile dev up

Detached mode:

docker compose --env-file .env --profile dev up -d

Method 2: App container only (external DB)

Use this when your database is managed externally.

docker build -t pdr-ai-app .
docker run --rm -p 3000:3000 \
  -e DATABASE_URL="$DATABASE_URL" \
  -e CLERK_SECRET_KEY="$CLERK_SECRET_KEY" \
  -e NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY="$NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY" \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  pdr-ai-app

Method 3: DB container only (host app)

docker compose --env-file .env up -d db
pnpm dev

To connect database tools running on the host, use localhost:5433.

How Docker Supports Platform Features

app service runs auth, upload, OCR integration, RAG chat, and predictive analysis.
db service provides pgvector-backed storage/retrieval for embeddings.
migrate service ensures schema readiness before app startup.
Optional providers (Inngest, Tavily, OCR, LangSmith) are enabled by env vars in the same runtime.

Documentation

Deployment details (Docker, Vercel, VPS): docs/deployment.md
Feature workflows and architecture: docs/feature-workflows.md
Usage and API examples: docs/usage-examples.md
Observability and metrics: docs/observability.md
Manual testing (dev, post-PR): docs/manual-testing-guide.md

API Endpoints (high-level)

POST /api/uploadDocument - upload and process document (OCR optional)
POST /api/LangChain - document-grounded Q&A
POST /api/agents/predictive-document-analysis - detect gaps and recommendations
GET /api/metrics - Prometheus metrics stream
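
The response from GET /api/metrics can be consumed by any Prometheus scraper. For ad hoc inspection, a simplified parser for the Prometheus text format might look like the sketch below. It skips comment lines (# HELP, # TYPE) and keeps the label set as a raw string; it does not handle timestamps or special values such as +Inf. The metric names in the usage example are made up.

```typescript
interface Sample {
  name: string;
  labels: string; // raw label block, e.g. {route="/x"}, or "" when absent
  value: number;
}

// Parse metric lines of the form: name{labels} value
function parseMetrics(body: string): Sample[] {
  const samples: Sample[] = [];
  for (const raw of body.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const match = /^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)/.exec(line);
    if (!match) continue;
    samples.push({
      name: match[1] ?? "",
      labels: match[2] ?? "",
      value: Number(match[3]),
    });
  }
  return samples;
}

// Usage with made-up metric names:
const parsed = parseMetrics('# HELP demo\nhttp_requests_total{route="/api/LangChain"} 42\nup 1');
console.log(parsed.length); // 2
```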

User Roles

Employee: view assigned documents, use AI chat/analysis.
Employer: upload/manage documents, categories, and employee access.
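
The two roles can be expressed as a simple permission table. The sketch below lists only the capabilities stated above; the action names and the can helper are assumptions, and whether Employers also hold the Employee capabilities is not specified here, so the sketch does not grant them.

```typescript
type Role = "employee" | "employer";
type Action =
  | "view_assigned_documents"
  | "use_ai_chat"
  | "upload_documents"
  | "manage_categories"
  | "manage_employee_access";

// Hypothetical permission table derived from the role descriptions above.
const PERMISSIONS: Record<Role, readonly Action[]> = {
  employee: ["view_assigned_documents", "use_ai_chat"],
  employer: ["upload_documents", "manage_categories", "manage_employee_access"],
};

// Check whether a role is allowed to perform an action.
function can(role: Role, action: Action): boolean {
  return PERMISSIONS[role].includes(action);
}
```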

Useful Scripts

pnpm db:studio
pnpm db:push
pnpm check
pnpm lint
pnpm typecheck
pnpm build
pnpm start

Roadmap — Future Integrations

Notion API-key connector: Paste your Notion Internal Integration token in settings, select pages to sync. No OAuth required. Contributions welcome.
GitHub webhook sync: Automatically ingest new issues and PRs via repository webhooks.
Google Drive watch: Automatic re-sync when Google Docs are updated, using Drive push notifications.

Troubleshooting

Confirm Docker is running before DB startup.
If build issues occur: remove .next and reinstall dependencies.
If OCR UI is missing: verify OCR provider keys are configured.
If Docker image pull/build is corrupted: remove image and rebuild with --no-cache.

Contributing

Create a feature branch.
Make changes and run pnpm check.
Open a pull request with test notes.

📝 License

Private and proprietary.

📞 Support

Open an issue in this repository or contact the development team.

