Integrative multi-omics analysis of ER-positive, HER2-negative breast cancer
This repository contains code, notebooks, and workflows supporting the study:
Mosquim Junior, S., Zamore, M., Vallon-Christersson, J. et al. Multiomic profiling of ER-positive HER2-negative breast cancer reveals markers associated with metastatic spread. Breast Cancer Res 28, 12 (2026). https://doi.org/10.1186/s13058-025-02173-9
The project focuses on identifying molecular features associated with lymph-node and distant metastasis by adopting a multi-omics approach which integrates proteomics, phosphoproteomics, transcriptomics and clinical data.
According to recent statistics, breast cancer is the most commonly diagnosed cancer and a leading cause of cancer-related mortality among women. Over the past few decades, considerable advances in molecular characterization of the disease and in early detection have contributed to improved patient stratification and reduced mortality. Despite these advances, metastatic disease remains the main cause of cancer-related deaths. In this work, a multi-omics approach was adopted for the molecular profiling of primary breast tumor tissues and the identification of potential subtypes and markers associated with metastatic processes.
This repository enables:
- Reproducible analysis of the collected multi-omics data
- Exploration of consensus clustering and survival analyses
- Reproduction of key figures and results from the publication
.
├── Dockerfile # Container for reproducible execution
├── README.md
├── docker-compose.yaml # Docker compose for running the pipeline
├── entrypoint.sh # Entrypoint script
├── notebooks/ # R notebooks for analysis and visualization
├── renv/ # R environment management files
├── renv.lock # Locked package versions
└── .gitignore
Using the Docker container is recommended for full reproducibility.
- Clone the repository
git clone https://github.com/ComputationalProteomics/BreastCancerMultiomics.git
- Download the data from PRIDE (PXD059920)
diann_search_results_phosphoproteomics.zip
diann_search_results_proteomics.zip
- 7-Zip may be needed to extract only the files required for the analysis
On macOS:
brew install p7zip
On Linux (Ubuntu/Debian):
sudo apt install p7zip-full
- Extract the data files
mkdir -p BreastCancerMultiomics/data/proteomics/{full,phospho}
7z e diann_search_results_phosphoproteomics.zip "230815_report.pr_matrix.tsv" -oBreastCancerMultiomics/data/proteomics/phospho/
7z e diann_search_results_proteomics.zip "230814_report.pg_matrix.tsv" "230814_report.pr_matrix.tsv" -oBreastCancerMultiomics/data/proteomics/full/
- Pull the docker image
docker pull ghcr.io/computationalproteomics/breastcancermultiomics/multiomics_analysis:latest
- Run the pipeline
PUID=$(id -u) PGID=$(id -g) docker compose -f BreastCancerMultiomics/docker-compose.yaml run --rm analysis_pipeline

If you use this repository or its contents, please cite:
Mosquim Junior, S., Zamore, M., Vallon-Christersson, J. et al. Multiomic profiling of ER-positive HER2-negative breast cancer reveals markers associated with metastatic spread. Breast Cancer Res 28, 12 (2026). https://doi.org/10.1186/s13058-025-02173-9
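
Before running the pipeline, it can help to confirm that the extraction step produced the three expected files. The following is a minimal sketch and not part of the repository: `check_inputs` is a hypothetical helper, and it assumes the pipeline reads the files from the `data/proteomics/{full,phospho}` directories created in the steps above.

```shell
# Pre-flight check for the extracted DIA-NN matrices (illustrative sketch).
# check_inputs is a hypothetical helper, not shipped with the repository.
# The file names and directories are those used in the extraction step.

check_inputs() {
  # $1: path to the cloned repository (defaults to BreastCancerMultiomics)
  local root="${1:-BreastCancerMultiomics}"
  local missing=0
  local f
  for f in \
    "data/proteomics/phospho/230815_report.pr_matrix.tsv" \
    "data/proteomics/full/230814_report.pg_matrix.tsv" \
    "data/proteomics/full/230814_report.pr_matrix.tsv"
  do
    # -s is true only if the file exists and is non-empty
    if [ ! -s "$root/$f" ]; then
      echo "missing or empty: $root/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

After sourcing or pasting the function, `check_inputs BreastCancerMultiomics && echo "inputs OK"` prints `inputs OK` only when all three files are present and non-empty. Otherwise each missing path is reported on standard error.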