46 changes: 46 additions & 0 deletions docs/docs/cogstack-ce/_index.md
@@ -0,0 +1,46 @@
# CogStack Community Edition (CogStack CE)

CogStack Community Edition (CogStack CE) is a one-line installation of the open-source apps, AI products, and data engineering tools in CogStack.

The installation is preconfigured with default data sets, configurations, and example dashboards already set up for you.

The CE aims to show what is possible with the open source CogStack products, and give you ideas of how you could build upon this and integrate with your real data.

## What is CogStack CE?

**CogStack CE** is an all-in-one Kubernetes deployment for clinical NLP workflows.

It combines model serving, de-identification, model training, notebook-based analysis, and search tooling into one Helm release.

| Product | Primary use case |
| --- | --- |
| MedCAT service | Extract medical concepts and entities from free text. |
| AnonCAT service | De-identify clinical text for safer downstream use. |
| MedCAT Trainer | Train, tune, and manage MedCAT models. |
| JupyterHub | Run notebooks for experimentation and end-to-end workflows. |
| OpenSearch + Dashboards | Index, search, and explore operational or NLP data. |

## Where to start

1. [Tutorial: Quickstart](./tutorial/quickstart-installation.md)
2. [Tutorial: End-to-end JupyterHub](./tutorial/end-to-end-jupyterhub.md)

## Installation and customization (reference)

For the full installation reference, deployment instructions, and customizations, see:

- [Deployment](../platform/deployment/_index.md)
- [CogStack CE Helm chart (install + customization)](../platform/deployment/helm/charts/cogstack-ce-helm.md)

## Models

The default installation comes with basic models that are intended for demo purposes only.

Public models are also available; these require an NIH profile or a UMLS license. See the [MedCAT](https://github.com/CogStack/cogstack-nlp/tree/main/medcat-v2) documentation for how to get these models.

!!! tip
    For access to high-performing models trained on real-world clinical datasets, contact us.

## Next Steps

After setting up and trying CogStack Community Edition, you can explore the wider platform and its tools in more detail:

- [Getting Started with CogStack Platform](../overview/getting-started.md)
11 changes: 11 additions & 0 deletions docs/docs/cogstack-ce/tutorial/_index.md
@@ -0,0 +1,11 @@
# Tutorials

This section walks you through running CogStack CE locally and then using the deployed JupyterHub.

## Quickstart

- [Tutorial: Quickstart (Helm install + port-forward + open JupyterHub)](./quickstart-installation.md)

## Operate JupyterHub

- [Tutorial: Open and operate JupyterHub](./end-to-end-jupyterhub.md)
89 changes: 89 additions & 0 deletions docs/docs/cogstack-ce/tutorial/end-to-end-jupyterhub.md
@@ -0,0 +1,89 @@
# End-to-end JupyterHub tutorial

In this tutorial, you will open JupyterHub and run a notebook that calls deployed CogStack CE services.

By the end, you will have completed an end-to-end user flow:

1. Open JupyterHub
2. Log in
3. Open the bundled tutorial notebook
4. Run cells that call MedCAT and AnonCAT service APIs
5. Inspect the outputs

## Before you start

Make sure your CogStack CE release is installed and local port-forwarding is running.

If needed, re-run:

```sh
helm get notes <release> | bash
```

Replace `<release>` with your Helm release name (for example, `cogstack`).

## Step 1: Open JupyterHub

Open:

- http://127.0.0.1:8000

This should show the JupyterHub login page.

## Step 2: Log in

The community chart uses a dummy authenticator (for local/non-production use).

Log in with:

- Username: `admin`
- Password: `SuperSecret`

After login, JupyterLab opens for your user.

## Step 3: Open the bundled notebook

The chart includes an example notebook:

- `medcat-service-tutorial.ipynb`

You can open it directly:

- http://127.0.0.1:8000/user/admin/notebooks/medcat-service-tutorial.ipynb

Or navigate to it in JupyterLab and click to open it.

## Step 4: Run the notebook cells

Run each cell in order from top to bottom.

The notebook demonstrates service calls to:

- `medcat-service` at `/api/process` for named entity extraction
- `anoncat-service` at `/api/process` for de-identification

It uses environment variables for service URLs where available, so the default CogStack CE setup should work without edits.
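
The same calls can be tried from a terminal on your machine while port-forwarding is running. The following is a minimal sketch, not the notebook's own code: the variable names `MEDCAT_URL` and `ANONCAT_URL` and the helper functions are hypothetical, the default ports are the ones printed by the Helm notes, and the request body shape `{"content": {"text": "..."}}` is an assumption based on the MedCAT service API, so check the notebook if a request is rejected.

```sh
# Base URLs for the port-forwarded services.
# These variable names are hypothetical; the notebook may use different ones.
MEDCAT_URL="${MEDCAT_URL:-http://127.0.0.1:5000}"
ANONCAT_URL="${ANONCAT_URL:-http://127.0.0.1:5001}"

# Build the JSON request body. Assumed shape: {"content": {"text": "..."}}.
# No JSON escaping is done, so keep double quotes and backslashes out of the text.
build_payload() {
  printf '{"content":{"text":"%s"}}' "$1"
}

# POST the text to <base-url>/api/process and print the raw JSON response.
# Usage: process_text <base-url> <text>
process_text() {
  curl -sS -X POST "$1/api/process" \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$2")"
}
```

For example, `process_text "$MEDCAT_URL" "Patient has diabetes"` should return annotation JSON, and the same call with `"$ANONCAT_URL"` should return de-identified output, provided both services are up.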

## Step 5: Verify the end-to-end outputs

As you run cells, confirm that:

- MedCAT returns annotation output for the sample text
- AnonCAT returns de-identified output
- The JSON responses are displayed in the notebook

If those outputs appear, you have validated the full end-to-end flow from JupyterHub to deployed CogStack CE services.

## Troubleshooting

- If JupyterHub does not load, ensure port-forwarding is running.
- If notebook requests fail, verify the cluster services are up and re-run:
    - `helm get notes <release> | bash`
- For production deployments, replace dummy authentication with secure auth configuration.
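
To see quickly which port-forwards are alive, something like the sketch below may help. It assumes `curl` is installed and uses the local addresses printed by the Helm notes in the Quickstart tutorial; the function names are hypothetical. A service counts as "up" if anything answers on the port, whatever the HTTP status, so a login or authentication error page still counts.

```sh
# Local endpoints as printed by the Helm notes in the Quickstart tutorial.
SERVICE_URLS="http://127.0.0.1:5000 http://127.0.0.1:5001 http://127.0.0.1:8080 https://127.0.0.1:9200 http://127.0.0.1:5601 http://127.0.0.1:8000"

# Print "up" if anything answers on the URL, "down" otherwise.
# -k skips TLS verification, on the assumption that the local OpenSearch
# endpoint uses a certificate your machine does not trust.
probe() {
  if curl -k -s -o /dev/null --max-time 3 "$1"; then
    echo "up"
  else
    echo "down"
  fi
}

# Probe every endpoint and print one status line per service.
check_all() {
  for url in $SERVICE_URLS; do
    printf '%s %s\n' "$(probe "$url")" "$url"
  done
}
```

Running `check_all` prints one line per endpoint. If every line says `down`, port-forwarding is most likely not running.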


## Next Steps

- See the [full deployment documentation](../../platform/deployment/_index.md) for more details on scaling, production security, and advanced configuration options.
- See the full install instructions for the [CogStack CE Helm chart (install + customization)](../../platform/deployment/helm/charts/cogstack-ce-helm.md).
- See further MedCAT tutorials on [GitHub](https://github.com/CogStack/cogstack-nlp/tree/79f00cfc204f4ae559b56c8e397bbcaf82d44274/medcat-v2-tutorials).
63 changes: 63 additions & 0 deletions docs/docs/cogstack-ce/tutorial/quickstart-installation.md
@@ -0,0 +1,63 @@
# Quickstart

This tutorial installs CogStack CE using Helm, sets up port-forwarding, and opens the bundled JupyterHub in your browser.

The install should take around 15 minutes, and by the end of this tutorial you will have a fully working and integrated CogStack environment that you can start using.

## Prerequisites

- A Kubernetes cluster
- Helm 3+
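
A quick preflight check can confirm both prerequisites before you install. This is a minimal sketch under a few assumptions: `kubectl` is installed and pointed at your cluster (the later port-forwarding step relies on it), and `helm version --short` prints a string such as `v3.14.2+g...`. The function names are hypothetical.

```sh
# Return success if the named command is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Extract the major version from a string such as "v3.14.2+gabc1234".
helm_major() {
  ver="${1#v}"
  echo "${ver%%.*}"
}

# Check for the tools, the Helm major version, and a reachable cluster.
preflight() {
  require_cmd helm    || { echo "helm not found on PATH" >&2; return 1; }
  require_cmd kubectl || { echo "kubectl not found on PATH" >&2; return 1; }
  major="$(helm_major "$(helm version --short)")"
  [ "$major" -ge 3 ] || { echo "Helm 3+ required, found: $major" >&2; return 1; }
  # Fails if the current kubeconfig context cannot reach a cluster.
  kubectl cluster-info >/dev/null || { echo "no reachable Kubernetes cluster" >&2; return 1; }
  echo "preflight OK"
}
```

Calling `preflight` before the install step catches a missing tool or an unreachable cluster early, rather than partway through a long install.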

## 1. Install CogStack CE

Run:

```sh
helm install cogstack oci://registry-1.docker.io/cogstacksystems/cogstack-helm-ce --timeout 15m
```

This command installs CogStack Community Edition with all the default values.

!!! warning
    For brand new installations, expect this to take up to 15 minutes. The cluster needs to download many GB of Docker images first and then start up the services.

Once the initial installation is done, any later updates should be significantly faster.

The defaults are set for a production-ready environment. See [Deployment](../../platform/deployment/_index.md) for detailed deployment information and customization options.

## 2. Port-forward and open JupyterHub

1. Set up the port-forwarding endpoints for the services. The Helm notes contain a script that does this for you.

<!-- termynal -->

```sh
$ helm get notes cogstack | bash
bash: line 1: NOTES:: command not found
Forwarding from 127.0.0.1:5001 -> 5000
Forwarding from [::1]:5001 -> 5000
Forwarding from 127.0.0.1:5000 -> 5000
Forwarding from [::1]:5000 -> 5000
# ...
Visit http://127.0.0.1:5000 to use MedCAT Service
Visit http://127.0.0.1:5001 to use AnonCAT
Visit http://127.0.0.1:8080 to use MedCAT Trainer
Visit https://127.0.0.1:9200 to use OpenSearch
Visit http://127.0.0.1:5601 to use OpenSearch Dashboards
Visit http://127.0.0.1:8000 to use jupyterhub
Visit http://127.0.0.1:8000/user/admin/notebooks/medcat-service-tutorial.ipynb to get started with a tutorial
```

This command runs `kubectl port-forward` in the background.

If you use a custom namespace or Helm release name, add the namespace or replace `cogstack` in the commands above accordingly.

2. Open http://127.0.0.1:8000 in a web browser.

JupyterHub should start and present a login screen.
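
If you installed under a different release name or namespace, the port-forwarding command from step 1 can be parameterised along these lines. This is a sketch only: `forward_ports` and `DRY_RUN` are hypothetical names, and the defaults of `cogstack` and `default` mirror the commands on this page. Whether the generated `kubectl port-forward` commands also need the namespace depends on the chart's notes template, which is not shown here.

```sh
# Fetch the Helm notes for a release and run them to start port-forwarding.
# Usage: forward_ports [release] [namespace]
forward_ports() {
  release="${1:-cogstack}"
  namespace="${2:-default}"
  if [ -n "${DRY_RUN:-}" ]; then
    # Show what would run without touching the cluster.
    echo "helm get notes $release --namespace $namespace | bash"
    return 0
  fi
  helm get notes "$release" --namespace "$namespace" | bash
}
```

Because the notes are piped straight into `bash`, it can be worth running `helm get notes <release>` on its own first to read what will be executed.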


## Next step: run the bundled notebook

Continue with: [Tutorial: Open and operate JupyterHub](./end-to-end-jupyterhub.md).
73 changes: 63 additions & 10 deletions docs/docs/index.md
@@ -1,30 +1,83 @@
Welcome to the CogStack Documentation site.

## What is CogStack?
![CogStack Architecture](overview/attachments/architecture.png)

CogStack is a lightweight distributed, fault tolerant database processing architecture and ecosystem, intended to make NLP processing and preprocessing easier in resource constrained environments. It comprises of multiple components, and has been designed to provide configurable data processing pipelines for working with EHR data.
CogStack lets you unlock the power of healthcare data.

CogStack uses databases and files as primary sources of EHR data, with support for custom data connectors. The platform leverages [Apache NiFi](https://nifi.apache.org/) to provide fully configurable data processing pipelines with the goal of generating annotated JSON standardised schema files that can be readily indexed into [ElasticSearch](https://www.elastic.co/), stored as files or pushed back to a database.
CogStack is a healthcare suite with interchangeable modules for analysing clinical data, using AI to draw insights from text or documents in Electronic Health Records.

There is a wide range of features, including Generative AI, Natural Language Processing, Full Search, Alerting, Cohort Selection, Population Health Dashboards, Deep Phenotyping and Clinical Research.

CogStack is a commercial open-source product, with the code available on GitHub: [https://github.com/CogStack/](https://github.com/CogStack/) . For enterprise deployments, full platform setup, and advanced features, please [contact us](https://docs.cogstack.org/en/latest/).
CogStack is a commercial open-source product, with the code for the community edition available on GitHub: [https://github.com/CogStack/](https://github.com/CogStack/). For enterprise deployments, full platform setup, and advanced features, please [contact us](https://docs.cogstack.org/en/latest/).

!!! tip
## Demo
Try a demo of the MedCAT natural language processing tool on [https://medcat.app.cogstack.org/](https://medcat.app.cogstack.org/)

This tool demonstrates named entity resolution on patient records to extract SNOMED clinical terms. This can be integrated for clinical coding and search applications.

!!! warning
    Do not put real patient data into this demo page.
## Quickstart

CogStack is designed as a microservices-based ecosystem. The recommended deployment method is on **Kubernetes using Helm charts**, which provides cloud-native support, scalability, and reliability. Ready-to-use CogStack images are available from the official Docker Hub under the [cogstacksystems](https://hub.docker.com/u/cogstacksystems/) organisation. Docker Compose is still supported for development and smaller deployments, but Kubernetes is recommended for production environments.
Deploy the CogStack Community Edition on an existing Kubernetes cluster using Helm.

## What is CogStack For?
<!-- termynal -->

CogStack consists of a range of technologies designed to support modern, open source healthcare analytics, and is chiefly comprised of the Elastic stack ([ElasticSearch](https://www.elastic.co/products/elasticsearch), [Kibana](https://www.elastic.co/products/kibana), etc.), [MedCAT](https://github.com/CogStack/MedCAT) (clinical natural language processing for named entity extraction and linking, contextualization, and realtion extraction), clinical text [OCR](https://github.com/CogStack/ocr-service), and clinical text de-identification. Since the processed EHR data can be represented and stored in databases or ElasticSearch, CogStack can be perfectly utilised as one of the solutions for integrating EHR data with other types of biomedical, -omics, wearables data, etc.
```sh
$ helm install \
cogstack oci://registry-1.docker.io/cogstacksystems/cogstack-helm-ce \
--timeout=15m0s
---> 100%
Pulled: registry-1.docker.io/cogstacksystems/cogstack-helm-ce:0.0.1
Digest: sha256:02e8ad3df7173270f7fdeb3e1ed5133427cec06ffc15b4ce763fa9bb062c8df1

---
NAME: cogstack
LAST DEPLOYED: Mon Mar 23 16:19:05 2026
NAMESPACE: default
STATUS: deployed
REVISION: 1
DESCRIPTION: Install complete
NOTES:
...
# CogStack Community Edition is installed
# Setup Complete
# Run this command line to setup port-forwarding and access services
# `helm get notes cogstack | bash`
```

See [CogStack Community Edition (CE)](cogstack-ce/_index.md) to continue this process.

## Community and support

- **Questions?** Reach out in the [CogStack community forum](https://discourse.cogstack.org/).
- **Code and projects:** [CogStack on GitHub](https://github.com/orgs/CogStack/repositories).


### Architecture

![CogStack Architecture](overview/attachments/architecture.png)

CogStack comprises a suite of applications, all using a common AI and data engineering platform. It is designed to be a self-hosted platform where you run your own instances and keep all of your data on premise, with full support for air-gapped environments.

The applications provide features for:

- Clinical Coding
- Search and Audit of EHRs
- Cohorting
- EHR Analytics
- De-identification of patient records
- Clinical Decision Support (CDS)

The AI and Data Engineering layer comprises:

- Healthcare language models trained on large real-world data sets
- The open source MedCAT and AnonCAT natural language processing libraries
- Data Engineering pipelines using Apache NiFi and OpenSearch to read unstructured and structured data
- MLOps tooling for model training and validation

!!! tip
    Many of these apps and tools are open source and available on the [CogStack GitHub](https://github.com/CogStack), subject to the licensing in each project.
    The public documentation on this site covers these open-source community offerings.
    For advanced and enterprise-level features, see [products](https://cogstack.org/products/).

## Next Steps

[Get Started ](overview/getting-started.md){ .md-button .md-button--primary }
[Get Started ](overview/getting-started.md){ .md-button .md-button--primary }
5 changes: 5 additions & 0 deletions docs/docs/platform/deployment/helm/charts/_index.md
@@ -9,9 +9,14 @@ The Helm charts for CogStack are published to Docker Hub, which is an OCI-compli
- **MedCAT Trainer:**
https://hub.docker.com/r/cogstacksystems/medcat-trainer-helm

- **CogStack CE (Community Edition, umbrella chart):**
https://hub.docker.com/r/cogstacksystems/cogstack-helm-ce

- [MedCAT Service Helm](medcat-service-helm.md)
- [MedCAT Trainer Helm](medcat-trainer-helm.md)

- [CogStack CE Helm](cogstack-ce-helm.md)

## Chart Publishing

Charts are published automatically via a GitHub Action on every commit to the main branch.