
MA-LLM Pipeline — README

This repository contains the MA-LLM screening pipeline, a tool for automated screening of PubMed articles using Large Language Models (LLMs). It supports two modes:

  • Screening Selection Comparison: compare LLM-based selections with a manual gold standard.
  • Comparison-free Screening (Freeform): run PubMed searches and screen articles without a comparison set.

This README explains how to run the project from source, notes on the provided .exe (if any), required Python packages, common troubleshooting steps, and recommended small fixes and naming conventions. It is mainly intended for use when the provided .exe file does not work on your system (for example, on macOS or unsupported platforms).

Table of contents

  • Recommended usage of the pipeline
  • Project Structure
  • Quick start (source / ZIP)
  • Running the Flask UI
  • Troubleshooting (common errors including HTTP 500)
  • Notes and recommendations

Recommended usage of the pipeline

We provide an .exe build of the code that bundles the necessary packages and the .html file, so we recommend downloading the latest release from GitHub and using the executable. If you are running Linux or macOS and do not want to install software for running Windows executables, you can clone the GitHub repo and follow the instructions below. The executable and the Python code are functionally identical; the executable is provided purely to enhance usability.

Project structure (important files/folders)

  • MALLM_Pipeline/MALLM.py — main Flask app and processing logic (entrypoint)
  • MALLM_Pipeline/templates/MALLM.html — front-end UI used by the Flask app
  • MALLM_Pipeline/ExampleFiles/ — example PMIDs, prompts and gold-standard files (use these to test input formats)
  • requirements.txt — Python package list
  • Readme.md — this document

Quick start (from source / ZIP)

  1. Clone or extract the ZIP and open a terminal in the repository root.
  2. Create and activate a Python virtual environment (recommended):
python3 -m venv .venv
source .venv/bin/activate    # zsh/bash
  3. Install dependencies:
pip install -r requirements.txt
  4. Start the Flask UI (the web UI will open automatically):
python "MALLM_Pipeline/MALLM.py"

Required Python packages

  • The project expects a set of Python packages. Ensure requirements.txt includes at least the following (add or pin versions as needed):

  • pandas
  • biopython
  • flask
  • openai (if using the OpenAI provider)
  • anthropic (if using the Anthropic provider)
  • google-generativeai (if using the Google provider)
  • ollama (if using Ollama)
  • openpyxl (used by pandas to read .xlsx files)
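
As a starting point, a minimal requirements.txt matching the list above might look like this (unpinned; pin versions as needed for reproducibility, and keep only the provider libraries you actually use):

```
pandas
biopython
flask
openai
anthropic
google-generativeai
ollama
openpyxl
```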

How the web UI submits work

  • The front-end sends a POST to either /run_comparison (goldstandard mode) or /run_freeform (freeform mode).
  • The server starts processing in a background thread and the front-end polls /status for progress.
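
As a sketch, a minimal Python client for this submit-and-poll flow could look like the following. The endpoint paths (/run_comparison, /run_freeform, /status) come from this README; the default address and the "done" key in the status JSON are assumptions — inspect MALLM.py and MALLM.html for the actual form fields and response shape.

```python
# Minimal client sketch for the submit-and-poll flow described above.
# Endpoint paths are from this README; BASE and the "done" status key
# are assumptions — check MALLM.py for the real response shape.
import json
import time
import urllib.request

BASE = "http://127.0.0.1:5000"  # default Flask address (assumption)


def submit_url(base: str, mode: str) -> str:
    """Map a screening mode to its submission endpoint."""
    path = "/run_comparison" if mode == "comparison" else "/run_freeform"
    return base.rstrip("/") + path


def poll_status(base: str = BASE, interval: float = 2.0) -> dict:
    """Poll /status until the background thread reports completion."""
    while True:
        with urllib.request.urlopen(base + "/status") as resp:
            status = json.load(resp)
        if status.get("done"):
            return status
        time.sleep(interval)
```

The actual file upload is easiest from the browser UI; this sketch is mainly useful for scripting the polling side once a run has been started.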

Common problems and troubleshooting

  • HTTP 500 "Submission failed: HTTP error! status: 500"

    • Meaning: the server raised an unhandled exception while processing the submission. This is a server-side error, not a front-end problem.
    • What to do: open your browser DevTools → Network, find the POST to /run_comparison or /run_freeform, and inspect the response body. The server returns JSON with message and traceback fields to help with debugging.
    • Likely causes in this codebase:
      • Required form fields or uploaded files were missing (the server expects initial_file, goldstandard_file, and prompts_file for comparison mode).
      • prompts_file is not an Excel file or has unexpected columns; pd.read_excel will raise on invalid input.
      • AI provider initialization failed (missing provider library, unsupported provider name, or authentication error).
      • If using Ollama (Local), no API key should be required; the UI currently asks for an API key for all providers — leave the API key blank for Ollama or update the UI/server to skip the key requirement for Ollama.
  • Unclear Screening Mode error message

    • If submission fails with a message about the screening mode, make sure Screening Mode is set to either Screening Selection Comparison (goldstandard) or Comparison-free Screening (freeform) before submitting.
  • Problems reading prompts.xlsx

    • The pipeline expects columns named like TitlePrompt, AbstractPrompt, screen_titles, and screen_abstracts in the prompts Excel file. If your file uses different column names, rename them or adapt the code.
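
To catch column-name problems before submitting, a quick pre-flight check along these lines can help. The required column names are the ones listed above; the helper itself is illustrative and not part of the pipeline.

```python
# Pre-flight check for the prompts Excel file, using the column names
# this README lists. The helper is illustrative, not part of the pipeline.
REQUIRED_COLUMNS = {"TitlePrompt", "AbstractPrompt", "screen_titles", "screen_abstracts"}


def missing_prompt_columns(columns) -> set:
    """Return the required prompt columns absent from the sheet header."""
    return REQUIRED_COLUMNS - set(columns)


if __name__ == "__main__":
    import pandas as pd  # same reader the pipeline uses (pd.read_excel)

    df = pd.read_excel("prompts.xlsx")  # path is illustrative
    missing = missing_prompt_columns(df.columns)
    if missing:
        raise SystemExit(f"prompts file is missing columns: {sorted(missing)}")
    print("prompts file column check passed")
```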

Citation and license

https://doi.org/10.1017/rsm.2026.10093

This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NoDerivatives licence (https://creativecommons.org/licenses/by-nd/4.0), which permits re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited.

© The Author(s), 2026. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology