
MA-LLM Pipeline — README

This repository contains the MA-LLM screening pipeline, a tool for automated screening of PubMed articles using Large Language Models (LLMs). It supports two modes:

  • Screening Selection Comparison: compare LLM-based selections with a manual gold standard.
  • Comparison-free Screening (Freeform): run PubMed searches and screen articles without a comparison set.

This README explains how to run the project from source, notes on the provided .exe (if any), required Python packages, common troubleshooting steps, and recommended small fixes and naming conventions. It is mainly intended for use when the provided .exe file does not work on your system (for example, on macOS or unsupported platforms).

Table of contents

  • Recommended usage of the pipeline
  • Project Structure
  • Quick start (source / ZIP)
  • Running the Flask UI
  • Troubleshooting (common errors including HTTP 500)
  • Notes and recommendations

Recommended usage of the pipeline

We provide an .exe build of the code that bundles the necessary packages and the .html file, so we recommend downloading the latest release from GitHub and using the executable. If you are running Linux or macOS and do not want to install software for running Windows executables, you can clone the GitHub repo and follow the instructions below. The executable and the Python code are functionally identical; the executable is provided purely to enhance usability.

Project structure (important files/folders)

  • MALLM_Pipeline/MALLM.py — main Flask app and processing logic (entrypoint)
  • MALLM_Pipeline/templates/MALLM.html — front-end UI used by the Flask app
  • MALLM_Pipeline/ExampleFiles/ — example PMIDs, prompts and gold-standard files (use these to test input formats)
  • requirements.txt — Python package list
  • Readme.md — this document

Quick start (from source / ZIP)

  1. Clone or extract the ZIP and open a terminal in the repository root.
  2. Create and activate a Python virtual environment (recommended):
python3 -m venv .venv
source .venv/bin/activate    # zsh/bash
  3. Install dependencies:
pip install -r requirements.txt
  4. Start the Flask UI (the web UI will open automatically):
python "MALLM_Pipeline/MALLM.py"

Required Python packages

  • The project expects a set of Python packages. Ensure requirements.txt includes at least the following (add or pin versions as needed):

  • pandas
  • biopython
  • flask
  • openai (if using the OpenAI provider)
  • anthropic (if using the Anthropic provider)
  • google-generativeai (if using the Google provider)
  • ollama (if using Ollama)
  • openpyxl (used by pandas to read .xlsx files)
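
As a starting point, a minimal requirements.txt matching the list above might look like this (unpinned; pin versions as needed for reproducibility, and keep only the provider libraries you actually use):

```
pandas
biopython
flask
openai
anthropic
google-generativeai
ollama
openpyxl
```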

How the web UI submits work

  • The front-end sends a POST to either /run_comparison (goldstandard mode) or /run_freeform (freeform mode).
  • The server starts processing in a background thread and the front-end polls /status for progress.
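
As a sketch, a minimal Python client for this submit-and-poll flow could look like the following. The endpoint paths (/run_comparison, /run_freeform, /status) come from this README; the default address and the "done" key in the status JSON are assumptions — inspect MALLM.py and MALLM.html for the actual form fields and response shape.

```python
# Minimal client sketch for the submit-and-poll flow described above.
# Endpoint paths are from this README; BASE and the "done" status key
# are assumptions — check MALLM.py for the real response shape.
import json
import time
import urllib.request

BASE = "http://127.0.0.1:5000"  # default Flask address (assumption)


def submit_url(base: str, mode: str) -> str:
    """Map a screening mode to its submission endpoint."""
    path = "/run_comparison" if mode == "comparison" else "/run_freeform"
    return base.rstrip("/") + path


def poll_status(base: str = BASE, interval: float = 2.0) -> dict:
    """Poll /status until the background thread reports completion."""
    while True:
        with urllib.request.urlopen(base + "/status") as resp:
            status = json.load(resp)
        if status.get("done"):
            return status
        time.sleep(interval)
```

The actual file upload is easiest from the browser UI; this sketch is mainly useful for scripting the polling side once a run has been started.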

Common problems and troubleshooting

  • HTTP 500 "Submission failed: HTTP error! status: 500"

    • Meaning: the server raised an unhandled exception while processing the submission. This is a server-side error, not a front-end problem.
    • What to do: open your browser DevTools → Network, find the POST to /run_comparison or /run_freeform, and inspect the response body. The server returns JSON with message and traceback fields to help with debugging.
    • Likely causes in this codebase:
      • Required form fields or uploaded files were missing (the server expects initial_file, goldstandard_file, and prompts_file for comparison mode).
      • prompts_file is not an Excel file or has unexpected columns; pd.read_excel will raise on invalid input.
      • AI provider initialization failed (missing provider library, unsupported provider name, or authentication error).
      • If using Ollama (Local), no API key should be required; the UI currently asks for an API key for all providers — leave the API key blank for Ollama or update the UI/server to skip the key requirement for Ollama.
  • Unclear Screening Mode error message

    • If submission fails with a message about the screening mode, make sure Screening Mode is set to either Screening Selection Comparison (goldstandard) or Comparison-free Screening (freeform) before submitting.
  • Problems reading prompts.xlsx

    • The pipeline expects columns named like TitlePrompt, AbstractPrompt, screen_titles, and screen_abstracts in the prompts Excel file. If your file uses different column names, rename them or adapt the code.
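
To catch column-name problems before submitting, a quick pre-flight check along these lines can help. The required column names are the ones listed above; the helper itself is illustrative and not part of the pipeline.

```python
# Pre-flight check for the prompts Excel file, using the column names
# this README lists. The helper is illustrative, not part of the pipeline.
REQUIRED_COLUMNS = {"TitlePrompt", "AbstractPrompt", "screen_titles", "screen_abstracts"}


def missing_prompt_columns(columns) -> set:
    """Return the required prompt columns absent from the sheet header."""
    return REQUIRED_COLUMNS - set(columns)


if __name__ == "__main__":
    import pandas as pd  # same reader the pipeline uses (pd.read_excel)

    df = pd.read_excel("prompts.xlsx")  # path is illustrative
    missing = missing_prompt_columns(df.columns)
    if missing:
        raise SystemExit(f"prompts file is missing columns: {sorted(missing)}")
    print("prompts file column check passed")
```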

Citation and license

https://doi.org/10.1017/rsm.2026.10093

This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NoDerivatives licence (https://creativecommons.org/licenses/by-nd/4.0), which permits re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited.

© The Author(s), 2026. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology