CoMSES AgentSpace

A proof-of-concept agentic RAG application over the CoMSES Computational Model Library, built on Temporal.io.

A 5-minute setup-and-demo video on YouTube: https://www.youtube.com/watch?v=sfjV-Id7-vg

What this is

A POC that lets researchers ask natural-language questions across computational model data — metadata, documentation, and source code (see sample_data/) — and get answers with paragraph-level citations back to the source material.

The agent itself is a Temporal workflow (AgentWorkflow) whose tools can be either Temporal activities (fast, mostly side-effect-free) or Temporal child workflows (multi-step, durable, with their own progress events).
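
For orientation, here is a minimal sketch of that dispatch pattern, assuming the temporalio Python SDK; the tool and workflow names are hypothetical, not the repository's actual identifiers:

from datetime import timedelta
from temporalio import workflow

@workflow.defn
class AgentWorkflow:
    @workflow.run
    async def run(self, question: str) -> str:
        # Fast, mostly side-effect-free tool -> a plain activity
        chunks = await workflow.execute_activity(
            "search_chunks",  # hypothetical activity name
            question,
            start_to_close_timeout=timedelta(seconds=60),
        )
        # Multi-step durable tool -> a child workflow with its own progress events
        return await workflow.execute_child_workflow(
            "RetrievalWorkflow",  # hypothetical child-workflow name
            args=[question, chunks],
            id=f"retrieval-{workflow.info().workflow_id}",
        )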


What this is not

  • Not production-ready. No auth hardening, no rate-limiting at the public edge, etc.
  • Not a search box or a chatbot wrapper around a single model — it decomposes queries, resolves relevant models (with optional human-in-the-loop; sketched below), and runs hybrid (dense + sparse) vector search before generating cited answers.
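
A minimal sketch of how that optional human-in-the-loop step can work, assuming it is implemented as a Temporal signal (temporalio SDK); all names are illustrative:

from temporalio import workflow

@workflow.defn
class ModelResolutionWorkflow:
    def __init__(self) -> None:
        self._approved: list[str] | None = None

    @workflow.signal
    def approve_models(self, model_ids: list[str]) -> None:
        # A human confirms which candidate models are actually relevant
        self._approved = model_ids

    @workflow.run
    async def run(self, candidates: list[str]) -> list[str]:
        # Durably block until the approval signal arrives
        await workflow.wait_condition(lambda: self._approved is not None)
        return self._approved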

Architecture & Intent

Per-module intent.md files document the why behind each major decision:

  • intent.md — system-level rationale: agentic RAG over CoMSES, layered code structure, Temporal, worker split, event sourcing, LiteLLM proxy
  • src/modules/agent/intent.md — the conversation runtime: AgentWorkflow, three tool types, transactional outbox, context propagation
  • src/modules/ingestion/intent.md — write side: marker-pdf, synthetic Q&A enrichment, hybrid embeddings (dense + BM42 sparse), tree-sitter for code
  • src/modules/retrieval/intent.md — read side: intent analysis, query decomposition, model relevance + HITL, hybrid RRF search (sketched after this list), source attribution, near-real-time progress
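
As a reference for the "hybrid RRF search" step above, a minimal sketch of reciprocal rank fusion with the conventional k = 60 constant; the repo's actual implementation (or Qdrant's server-side fusion) may differ:

def rrf_fuse(dense_ids: list[str], sparse_ids: list[str],
             k: int = 60, top_n: int = 10) -> list[str]:
    """Merge two ranked hit lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranked in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)[:top_n]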

Prerequisites

Software

  • Linux, WSL2, or macOS (macOS untested)
  • Docker + Docker Compose (for the infrastructure stack)
  • ./setup.sh will install missing dependencies automatically

Hardware

  • 16 GB RAM minimum; 24 GB+ recommended (PyTorch + marker-pdf + embedding models share host memory)
  • ~10 GB disk for ML model weights and Docker images
  • GPU (optional): an NVIDIA GPU with CUDA for faster PDF parsing and embeddings

Verified on a Windows 11 laptop under WSL2 with 32 GB RAM and an 8 GB NVIDIA RTX 2000 Ada Generation GPU.

LLM access

  • An API key from at least one provider — OpenAI, Anthropic, OpenRouter, Groq, Google — or a local Ollama instance reachable at OLLAMA_HOST. setup.sh probes the keys you supply and auto-picks the first live profile.
  • Embeddings (dense + sparse BM42) run locally via FastEmbed by default — no separate API key needed. A GPU is highly recommended for embedding computation.
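
A minimal sketch of local hybrid embedding with FastEmbed; the model names below are common FastEmbed defaults, not necessarily the ones this repo configures:

from fastembed import TextEmbedding, SparseTextEmbedding

dense_model = TextEmbedding("BAAI/bge-small-en-v1.5")                          # assumed dense model
sparse_model = SparseTextEmbedding("Qdrant/bm42-all-minilm-l6-v2-attentions")  # BM42 sparse model

docs = ["An ant-foraging model with pheromone trails."]
dense_vecs = list(dense_model.embed(docs))    # numpy arrays
sparse_vecs = list(sparse_model.embed(docs))  # SparseEmbedding(indices=..., values=...)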

Setup

./setup.sh

The script bootstraps everything in phases: toolchain install, .env generation with auto-generated secrets, Docker stack startup (Postgres, Qdrant, Redis, MinIO, Temporal, LiteLLM), database migrations, model warming, and sample-data ingestion. It prompts for an LLM API key, LLM/embeddings configuration, and worker startup. When it finishes you'll have a UI at http://localhost:5173 and a sample dataset to query.

Run ./setup.sh --help for individual phase verbs (re-run a phase, recreate, etc.).

Setup phases

Each phase is idempotent (sentinel-gated) and resumable — a re-run picks up at the first incomplete or invalidated phase.
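
The gating pattern, sketched in Python for clarity (the real script is bash, and the sentinel location is hypothetical):

from pathlib import Path

SENTINELS = Path(".setup/sentinels")  # hypothetical location

def run_phase(name: str, phase_fn) -> None:
    marker = SENTINELS / name
    if marker.exists():       # phase already completed -> skip on re-run
        return
    phase_fn()                # any failure leaves the marker unwritten
    SENTINELS.mkdir(parents=True, exist_ok=True)
    marker.touch()            # written only on success, so re-runs resume here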

 #   Phase                   What it does
 1   toolchain               Detects required CLIs (node, uv, pnpm, zellij, jq, shellcheck, docker) and installs anything missing via the official installers.
 2   uv_sync                 Runs uv sync --group pdf (and --group gpu when an NVIDIA GPU is detected). First run downloads ~2 GB (PyTorch + marker-pdf), plus ~600 MB of cuDNN/cuBLAS wheels on GPU hosts.
 3   hardware_preflight      Warn-only RAM / swap / CPU / GPU posture check. Suggests .env overrides for low-memory hosts (e.g. INGEST_WORKER_MAX_CONCURRENT_ACTIVITIES=2); never hard-fails.
 4   env_bootstrap           Creates .env from .env.example (or appends new keys to an existing one) and generates per-deployment secrets (LITELLM_MASTER_KEY, MINIO_ROOT_PASSWORD, QDRANT_API_KEY, DB passwords, UI passwords).
 5   app_hostnames           Prompts for the public host the browser will use (default localhost; FQDN/IP for remote VMs, see Deploying CoMSES AgentSpace on a remote VM). Writes CORS_ALLOWED_ORIGINS, MINIO_EXTERNAL_ENDPOINT, VITE_API_BASE_URL, VITE_WS_BASE_URL, VITE_HOST, and VITE_ALLOWED_HOSTS as a coherent set, and validates the input against RFC 1123.
 6   env_triage              Detects a sibling Temporal stack already running on the same ports (7233 / 8080 / 9090 / 8085 / 16686) and refuses to start.
 7   provider_keys           Probes every supported LLM provider (OpenAI, Anthropic, Groq, OpenRouter, xAI, Google, GPUStack), prompts for a key when none are alive, and asks whether embeddings should run remote (LiteLLM) or local (FastEmbed in-process).
 8   marker_prewarm          Pre-downloads marker-pdf layout / OCR / text-recognition models (~1.5 GB) into ~/.cache/huggingface/ so the first PDF ingest doesn't stall.
 9   fastembed_prewarm       Pre-downloads the dense + sparse (BM42) embedding models locally. Dense is skipped when EMBEDDING_DENSE_PROVIDER=remote; sparse is always local.
10   docker_up               Brings up the Temporal stack and then the infra stack via docker compose up -d, then health-checks Postgres, Temporal, Redis, MinIO, Qdrant, and the LiteLLM proxy, in that order.
11   litellm_key             Calls POST /key/generate against the running LiteLLM proxy to mint a virtual API key and writes it to LITELLM_PROXY_API_KEY in .env (see the sketch after this table).
12   litellm_routing_probe   Per-role smoke calls (smart / default / fast / long / embed) against the proxy. Hard-fails if no chat role responds 2xx or if embed returns no vector.
13   migrations              Runs make db-check then make db-upgrade to bring the comses-rag-db schema to the latest Alembic head.
14   hosts_file              Validates that the Docker DNS names workers connect to (minio, redis, qdrant, ollama, litellm-proxy, litellm-db, comses-rag-db) resolve from the host. If any are missing, offers [a]uto sudo / [m]anual / [s]kip to append 127.0.0.1 … entries to /etc/hosts.
15   workers                 Prompts you to start the 10-pane Zellij worker layout in a second terminal (make w) and polls each worker's metrics port (10090–10098) until ready.
16   sample_data             Stages and ingests two bundled CoMSES codebases through the full pipeline (marker-pdf → fastembed → Qdrant + Postgres + MinIO).
17   dashboard               Prints the final dashboard: service URLs + credentials, Temporal CLI hint, Zellij attach command, sample-data summary, and a "Try it" pointer at the configured host.
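
For reference, the litellm_key phase corresponds to LiteLLM's documented key-generation endpoint; a sketch in Python (the exact payload the script sends is an assumption):

import os
import requests

resp = requests.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": f"Bearer {os.environ['LITELLM_MASTER_KEY']}"},
    json={"key_alias": "comses-agentspace"},  # payload fields: assumption
    timeout=30,
)
resp.raise_for_status()
virtual_key = resp.json()["key"]  # what setup.sh writes to LITELLM_PROXY_API_KEY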

After setup completes

Service            URL                               Credentials
Chat UI            http://localhost:5173             API key dev-key-1 (from API_KEY_MAPPING in .env)
FastAPI            http://localhost:8000
Temporal UI        http://localhost:8080
Grafana            http://localhost:8085             admin / $GRAFANA_ADMIN_PASSWORD
LiteLLM UI         http://localhost:4000/ui          admin / $LITELLM_PROXY_UI_PASSWORD
Jaeger             http://localhost:16686
Prometheus         http://localhost:9090
Qdrant dashboard   http://localhost:6333/dashboard   $QDRANT_API_KEY
MinIO Console      http://localhost:9001             minio_admin / $MINIO_ROOT_PASSWORD
pgAdmin            http://localhost:8888             $PGADMIN_DEFAULT_EMAIL / $PGADMIN_DEFAULT_PASSWORD
Databasus          http://localhost:4005

$VAR references are auto-generated values written into .env by the env-bootstrap phase — setup.sh also prints them once on completion. Look them up in .env, not here.

Temporal CLI

docker exec -it temporal-admin-tools temporal workflow list

Workers (Zellij)

zellij attach comses-workers

Sample data

Two real models from the CoMSES Model Library are ingested on the first run of setup.sh; see sample_data/ for the bundled codebases.

Try it

Open http://localhost:5173, log in with API key dev-key-1, and ask a multi-part question — e.g. "What ant-foraging models are in the library, and how do they differ?"

Remote VM deployment (Jetstream2, EC2, etc.)

See deployment/README.md for the full recipe — SSH-tunnel mode (recommended for solo dev) and HTTPS-via-Caddy mode (for sharing a public demo URL).

Develop

make d                   # start infrastructure (Postgres, Qdrant, Redis, MinIO, Temporal, LiteLLM)
make w                   # start all 10 Temporal workers (Zellij layout) + the chat app (backend + frontend)
make k                   # stop infra
make kw                  # kill all workers + chat app
make test                # unit tests (fast, mocked)
make test-integration    # integration tests (PMR containers)
make check               # ruff + mypy + deptry + qlty

Module-specific develop notes live in the per-module READMEs: backend/, frontend/, shared/, shared/worker_base/.

Contributing

Contributions are welcome.

Thanks

  • Temporal — the durable workflow engine that is the execution backbone of the ingestion workflows, the agent runtime, every retrieval tool, and the event-streaming outbox
  • marker-pdf — layout-aware PDF parsing for academic model documentation
  • Zellij — terminal multiplexer that hosts the 10-pane worker layout via make w

License

This project is released under the MIT License.

⚠️ Caveat — GPL-3.0 dependency. The PDF ingestion pipeline depends on marker-pdf (and its sub-dependency surya-ocr), both of which are licensed under GPL-3.0-or-later. While this project's own source code is MIT-licensed, anyone distributing or running the combined application with marker-pdf linked in is bound by GPL-3.0 obligations for that combined work.
