- Optimized especially for targeted nanopore sequencing data. (Repeat size estimation is available for all long-read sequencing data.)
- Using the Google Colab platform requires a personal Google account.
- Using the Linux CLI requires a GPU for basecalling. (Google Colab offers a GPU.)
- Input: POD5, FAST5, or FASTQ files
- Output: (1) Sequencing QC, (2) Repeat size estimation, (3) Repeat structure, (4) Methylation profiling
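Because the Linux CLI needs a GPU for basecalling, it may help to confirm that one is visible before installing. The sketch below is not part of RepeatLab. It assumes an NVIDIA GPU whose driver ships `nvidia-smi`; the helper name `gpu_status` is hypothetical.

```shell
# Report whether an NVIDIA GPU is visible to the driver.
# Assumption: the GPU is an NVIDIA card and nvidia-smi is installed with its driver.
gpu_status() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        echo "GPU detected"
    else
        echo "No GPU detected"
    fi
}

gpu_status
```

If no GPU is reported, the Google Colab route is the alternative mentioned above.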
Installation takes about 1–3 minutes.
$ git clone https://github.com/ChangLabSNU/RepeatLab.git
$ cd RepeatLab
$ conda env create -f environment.yml
$ conda activate repeatlab
$ cd RepeatLab/pre-requisites
$ snakemake --cores all
Specify the following in config.yml:
- Raw data directory path
- Sample name and target gene
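Before starting the pipeline, a quick check that the raw data directory actually contains input files can catch a mistyped path. The sketch below is not part of RepeatLab. The helper `count_inputs`, the `RAW_DIR` variable, and the file extensions it matches are assumptions made for illustration.

```shell
# Count candidate input files under a raw data directory.
# Assumption: inputs are recognized by these extensions (.pod5, .fast5, .fastq, .fastq.gz).
count_inputs() {
    find "$1" -type f \( -name '*.pod5' -o -name '*.fast5' \
        -o -name '*.fastq' -o -name '*.fastq.gz' \) | wc -l | tr -d ' '
}

# Set RAW_DIR to the raw data directory path written in config.yml;
# it falls back to the current directory when unset.
raw_dir="${RAW_DIR:-.}"
echo "Input files found in $raw_dir: $(count_inputs "$raw_dir")"
```

A count of 0 suggests the path in config.yml points to the wrong directory.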
$ cd ../RepeatLab
$ snakemake --cores 1
A test run is available using NA03697 DNA nanopore sequencing data in test-data/.
The test run takes about 5–10 minutes.
Download the test data to your own Google Drive and follow the instructions for Google Colab-based RepeatLab.
Since the input data information is already written in config.yml, just follow the instructions above.
After the test run, you can find the report file at analyses/NA03697_test-DMPK/report.html.
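One way to confirm that the test run produced its report is to check that the file exists and is not empty. This is a minimal sketch: the helper `check_report` is hypothetical, and it assumes the command is run from the directory that contains analyses/.

```shell
# Check that a report file exists and is not empty.
check_report() {
    if [ -s "$1" ]; then
        echo "Report found: $1"
    else
        echo "Report missing or empty: $1"
        return 1
    fi
}

# Path taken from the test-run description above.
# Assumption: the current directory is the one that contains analyses/.
check_report "analyses/NA03697_test-DMPK/report.html" || true
```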
If you encounter any errors while using RepeatLab, please report them under Issues.
Han, Y., Jang, J. H., & Chang, H. (2026). Targeted long-read sequencing for high-resolution repeat profiling in myotonic dystrophy type 1. Experimental & Molecular Medicine, 1-13.
@article{han2026targeted,
title={Targeted long-read sequencing for high-resolution repeat profiling in myotonic dystrophy type 1},
author={Han, Yoojung and Jang, Ja-Hyun and Chang, Hyeshik},
journal={Experimental \& Molecular Medicine},
pages={1--13},
year={2026},
publisher={Nature Publishing Group UK London}
}