Structural Changes in Gene Ontology Reveal Modular and Complex Representations of Biological Function
This repository contains the data, code, and supplementary materials for the manuscript:
Valverde S., et al. (2025). Structural Changes in Gene Ontology Reveal Modular and Complex Representations of Biological Function. Molecular Biology and Evolution.
Read the paper here: https://academic.oup.com/mbe/article/doi/10.1093/molbev/msaf148/8159018
This study uses network-based methods to analyze fifteen years of Gene Ontology (GO) evolution. We show that GO evolves not only through incremental growth, but also via curator-driven restructuring. In particular, we document a major semantic modularization in the Cellular Component branch in 2019, aligning GO with frameworks such as CARO and GO-CAM. Our results highlight the need for version-aware, multi-layer representations of ontological knowledge in computational biology.
gene-ontology-evolution/
├── data/ # Network files
│ ├── go_versions/ # Individual GO snapshots (in OBO graph format)
│ └── processed_networks/ # Processed network data (in Pajek graph format)
├── scripts/ # Python scripts used in the analysis
├── results/ # Figures and summary outputs
├── README.md # Project overview (this file)
└── LICENSE # Open source license
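As a rough illustration of how a file from processed_networks/ might be inspected, the sketch below reads a Pajek network with NetworkX and reports its size. It is not the authors' analysis pipeline. The embedded sample network, its vertex labels, and the helper names `load_pajek` and `summarize` are assumptions made for this example, and no actual file name from the repository is assumed.

```python
import networkx as nx

# Tiny hand-made Pajek sample so the sketch runs without repository data.
# The vertex labels are placeholders, not real GO terms.
SAMPLE_PAJEK = """*Vertices 3
1 "term_a"
2 "term_b"
3 "term_c"
*Arcs
1 2 1.0
2 3 1.0
"""


def load_pajek(path):
    """Read a Pajek file from disk (hypothetical helper).

    NetworkX returns a MultiGraph or MultiDiGraph for Pajek input.
    """
    return nx.read_pajek(path)


def summarize(graph):
    """Return basic size statistics for a loaded network."""
    return {
        "nodes": graph.number_of_nodes(),
        "edges": graph.number_of_edges(),
        "directed": graph.is_directed(),
    }


if __name__ == "__main__":
    # Parse the in-memory sample; for a real file, call load_pajek(path).
    print(summarize(nx.parse_pajek(SAMPLE_PAJEK)))
```

To inspect a real snapshot, pass the path of a file under data/processed_networks/ to `load_pajek` and hand the result to `summarize`.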
Figures can be reproduced using the scripts in the /scripts folder and the input data under /data. To get started:
- Clone this repository
- Ensure dependencies are installed (Python ≥ 3.8, NetworkX, Matplotlib, etc.)
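Before running the scripts, the dependency step above can be checked with a short script such as the following. This is a minimal sketch using only the standard library. The helper name `check_environment` is hypothetical, and the package list covers only the two packages named explicitly above; anything hidden behind "etc." is not guessed.

```python
import importlib.util
import sys

# Packages named explicitly in this README; further dependencies are not assumed.
REQUIRED = ("networkx", "matplotlib")


def check_environment(min_python=(3, 8), packages=REQUIRED):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python >= {wanted} required")
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment OK" if not issues else "\n".join(issues))
```

Running it prints either a confirmation or one line per unmet requirement.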
- Code: MIT License
- Data: CC-BY 4.0 International
See the LICENSE file for details.
For questions, please contact:
Sergi Valverde, Evolution of Networks Lab, Barcelona, s.valverde@csic.es