Posters.science is a platform for researchers to upload, share, and discover scientific conference posters. When a poster is uploaded, the platform automatically extracts structured metadata such as titles, authors, affiliations, sections, and figure captions. This makes posters findable, citable, and machine-readable.
The platform is built around FAIR principles (Findable, Accessible, Interoperable, Reusable) and integrates with Zenodo so that posters can be deposited with a DOI for long-term archival and citation.
Posters.science is developed by the FAIR Data Innovations Hub at the California Medical Innovations Institute (CalMI2).
| Resource | Description |
|---|---|
| poster2json | Python package and CLI for poster metadata extraction (docs) |
| poster-json-schema | JSON schema for machine-actionable and FAIR poster metadata (DataCite 4.7) |
| poster-json-examples | Manually annotated ground-truth poster examples |
| posters-science-extraction-api | Extraction API service used by the platform |
| poster-sentry | Lightweight multimodal scientific poster classifier |
| poster-sentry-training | Training data and scripts for the poster-sentry classifier |
| posters-science-dev-docs | Developer documentation site (live) |
| posters-science-survey | Community survey on scientific poster sharing practices |
| poster-sharing-reuse-paper-code | Analysis code for the poster sharing and reuse study |
When a user uploads a poster (PDF or image), the platform runs an automated extraction pipeline to convert the poster into structured, machine-readable metadata. Here is what happens:
flowchart TD
A["User uploads poster\n(PDF or image)"] --> B["File stored securely"]
B --> C{"File type?"}
C -->|PDF| D["Text extraction\nvia pdfalto"]
C -->|Image| E["Vision OCR\nvia Qwen2-VL"]
D --> F["Llama 3.1 8B\n(optimized for poster extraction)"]
E --> F
F --> G["Structured JSON metadata\n(poster-json-schema, DataCite 4.7)"]
G --> H["Metadata stored in database"]
H --> I["Discoverable and searchable\non the platform"]
G -.->|Optional| J["Deposit to Zenodo\nwith DOI"]
Text extraction. PDF posters are processed by pdfalto for layout-aware text extraction. Image posters (JPG, PNG) are processed by Qwen2-VL, a vision-language model that reads text directly from the image. This is handled by the extraction API.
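The routing step above can be sketched as a small dispatch on file extension. This is a minimal illustration, not the extraction API's actual code: the function name `choose_extractor` and the route labels are hypothetical, and the set of accepted image extensions is assumed from the formats named above.

```python
from pathlib import Path

# Hypothetical labels for the two extraction routes described above.
PDF_ROUTE = "pdfalto"      # layout-aware text extraction for PDFs
IMAGE_ROUTE = "qwen2-vl"   # vision-language OCR for images

# Assumed image extensions (the text names JPG and PNG).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def choose_extractor(filename: str) -> str:
    """Return the extraction route for an uploaded poster file."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return PDF_ROUTE
    if suffix in IMAGE_SUFFIXES:
        return IMAGE_ROUTE
    # Anything else is rejected rather than guessed at.
    raise ValueError(f"Unsupported poster file type: {suffix or '(none)'}")
```

Matching on the lowercased suffix keeps `poster.PDF` and `poster.pdf` on the same route. A production service would likely also inspect the file contents instead of trusting the name alone.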
Metadata structuring. The extracted raw text is structured into JSON by Llama 3.1 8B with parameters optimized for poster extraction. The output includes titles, authors, affiliations, content sections, and figure/table captions. See poster2json for the full extraction package.
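To make the shape of the output concrete, here is a rough sketch of a record holding the fields listed above. The class and field names are illustrative assumptions only; they are not the actual poster-json-schema or the poster2json output format, and the sample values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Section:
    heading: str
    text: str


@dataclass
class PosterMetadata:
    # Field names are hypothetical; see poster-json-schema for the real ones.
    title: str
    authors: list[str] = field(default_factory=list)
    affiliations: list[str] = field(default_factory=list)
    sections: list[Section] = field(default_factory=list)
    captions: list[str] = field(default_factory=list)


def to_json(meta: PosterMetadata) -> str:
    """Serialize the record, including nested sections, to a JSON string."""
    return json.dumps(asdict(meta), indent=2, ensure_ascii=False)


# Placeholder values for illustration only.
example = PosterMetadata(
    title="An Example Poster",
    authors=["A. Researcher"],
    affiliations=["Example University"],
    sections=[Section(heading="Methods", text="Placeholder text.")],
    captions=["Figure 1. Placeholder caption."],
)
```

Calling `to_json(example)` yields a nested JSON object in which each section becomes an object with `heading` and `text` keys.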
What gets stored. The poster file is stored securely. Extracted metadata conforming to the poster-json-schema (aligned with DataCite 4.7) is stored in a PostgreSQL database and made searchable on the platform. Users can optionally deposit their poster to Zenodo for a persistent DOI.
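For the optional Zenodo deposit, the extracted metadata has to be mapped onto Zenodo's deposition metadata. The sketch below builds such a payload without making any network request. It assumes the `upload_type`, `title`, `creators`, and `description` fields of Zenodo's deposition API; the function name `build_zenodo_metadata` is hypothetical, and how the platform itself performs the deposit is not described here.

```python
def build_zenodo_metadata(title: str, authors: list[str], description: str) -> dict:
    """Build a Zenodo deposition metadata payload for a poster.

    Hypothetical helper: it only assembles the dictionary and sends nothing.
    """
    if not title:
        raise ValueError("A title is required")
    if not authors:
        raise ValueError("At least one author is required")
    return {
        "metadata": {
            "upload_type": "poster",  # Zenodo's upload type for posters
            "title": title,
            "creators": [{"name": name} for name in authors],
            "description": description,
        }
    }
```

Keeping payload construction separate from the HTTP call makes the mapping easy to test on its own before any credentials or uploads are involved.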
| Layer | Technology |
|---|---|
| Frontend | Nuxt 3, Nuxt UI, Tailwind CSS |
| Backend | Nuxt server routes (Nitro) |
| Database | PostgreSQL via Prisma |
| Poster Extraction | poster2json (Python, Llama 3.1, Qwen2-VL, pdfalto) |
| File Storage | CDN-backed object storage |
| Repository Integration | Zenodo |
| Deployment | Docker |
- Clone the repository

      git clone https://github.com/fairdataihub/posters-science.git
      cd posters-science

- Trust and install the required tool versions

      mise trust
      mise install

- Install dependencies

      pnpm install

- Add your environment variables

      cp .env.example .env

- Start the development server

      pnpm dev

- Open the application at http://localhost:3000

The application uses PostgreSQL. Run it locally with Docker:

    docker-compose -f ./dev-docker-compose.yaml up -d

Stop the database:

    docker-compose -f ./dev-docker-compose.yaml down

The application uses Prisma to interact with the database.
The application uses Nuxt UI for components and Tailwind CSS for styling.
For architecture details, see the Developer Documentation.
@software{posters_science2026,
title = {Posters.science: A Platform for Sharing, Discovering, and Citing Scientific Posters},
author = {Soundarajan, Sanjay and O'Neill, James and Portillo, Dorian and Patel, Bhavesh},
year = {2026},
url = {https://github.com/fairdataihub/posters-science}
}

This project is funded by The Navigation Fund (10.71707/rk36-9x79).
Distributed under the MIT License. See LICENSE for details.
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
