AbTune

Layer-wise selective fine-tuning of protein language models for antibodies.

AbTune is a user-friendly framework for sequence-specific and computationally efficient fine-tuning of protein language models (pLMs) on antibody datasets.

AbTune implements a layer-wise selective fine-tuning strategy in which only a subset of transformer layers is updated during adaptation. This substantially reduces computational cost while improving performance on antibody-related downstream tasks.
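
As a rough illustration of what layer-wise selection means in practice, the sketch below picks which parameters would stay trainable when only the attention modules of chosen layers are updated. The parameter naming pattern (`layers.<i>.self_attn.*`), the toy name list, and the helper `select_trainable` are assumptions made for this example; they are not taken from AbTune's source code.

```python
import re

# Hypothetical parameter names, patterned after a transformer with numbered layers.
PARAM_NAMES = [
    "embed_tokens.weight",
    "layers.0.self_attn.q_proj.weight",
    "layers.0.fc1.weight",
    "layers.1.self_attn.q_proj.weight",
    "layers.1.self_attn.out_proj.bias",
    "layers.2.self_attn.k_proj.weight",
    "lm_head.weight",
]

# Matches attention parameters and captures the layer index.
LAYER_PATTERN = re.compile(r"^layers\.(\d+)\.self_attn\.")


def select_trainable(param_names, layer_ids):
    """Return names of attention parameters in the chosen layers.

    An empty layer_ids selects attention parameters in every layer.
    """
    chosen = set(layer_ids)
    selected = []
    for name in param_names:
        match = LAYER_PATTERN.match(name)
        if match is None:
            continue  # not an attention parameter: stays frozen
        if not chosen or int(match.group(1)) in chosen:
            selected.append(name)
    return selected


trainable = select_trainable(PARAM_NAMES, layer_ids=[0, 1])
```

With a PyTorch model, the same selection could be applied by iterating over `model.named_parameters()` and setting `requires_grad` to `True` only for the selected names, so that the optimizer holds gradients and state for a small fraction of the weights.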

The framework currently supports:

Sequence representation fine-tuning with ESM2
Structure prediction with ESMFold
Conservation and mutation-effect analysis through sequence scanning

The tool accompanies the preprint:

Xu et al. AbTune: Layer-wise Selective Fine-Tuning of Protein Language Models for Antibodies. bioRxiv, 2025.

https://www.biorxiv.org/content/10.1101/2025.10.17.682998v1

Features

Layer-wise selective fine-tuning for reduced GPU memory usage
Efficient adaptation of large protein language models
Supports both sequence- and structure-level workflows
Compatible with ESM2 and ESMFold backbones
YAML-based configuration system
Automatic heavy/light chain handling
FASTA-based input pipelines
Conservation and mutation-scanning utilities

Installation

Requirements

Dependency	Version
Python	>= 3.10
biopython	1.85
fair-esm	2.0.0
numpy	1.26.4
omegaconf	2.3.0
pandas	2.2.3
torch	2.1.2

All required dependencies are installed automatically when AbTune is installed.

Step 1 (Optional): Install ESMFold

If you intend to use AbTune in ESMFold mode, ESMFold must be installed separately before installing AbTune.

ESMFold requires additional dependencies, including OpenFold, which are not bundled with this package.

Clone the official repository and follow its installation instructions:

git clone https://github.com/facebookresearch/esm

More details:

https://github.com/facebookresearch/esm

Step 2: Install AbTune

Install from PyPI (recommended)

pip install Ab-Tune

Install from source

git clone https://github.com/haddocking/Finetune-Ab
cd Finetune-Ab
pip install .

This installs the command-line entry point:

ab-tune

Quick Start

AbTune is configured entirely through YAML configuration files.

Run a job using:

ab-tune --config configs/ESM2.yaml

You can switch running modes, datasets, and hyperparameters without modifying the source code.

Running Modes

AbTune currently supports three operating modes.

1. ESM2 Mode

Fine-tunes ESM2 models directly on antibody sequences.

Only the linear projection layers inside Multi-Head Attention (MHA) modules are updated during training, while the remainder of the model remains frozen.

This mode is useful when:

Improved antibody embeddings are required
Structure prediction is not needed
Minimal computational overhead is desired

Typical Outputs

Output	Description
Embeddings	Fine-tuned antibody sequence representations
Training logs	Optimization and loss metrics
Validation metrics	Task-specific evaluation results

Example Use Cases

Embedding extraction
Similarity analysis
Downstream machine learning pipelines
Antibody property prediction

2. ESMFold Mode

Extends the ESM2 mode to additionally perform ESMFold structure prediction using the fine-tuned backbone.

Heavy and light chain sequences are concatenated internally using a 25-residue polyglycine linker before being passed to the model.

In our experiments, inclusion of the linker consistently improved model performance.

This mode requires ESMFold to be installed separately.

Typical Outputs

Output	Description
Predicted PDB structures	Antibody structural models
Per-residue confidence scores	pLDDT-style confidence estimates
Structure inference logs	Prediction metadata
Sequence embeddings	Fine-tuned latent representations
Training logs	Optimization and loss metrics

Notes

Heavy and light chains should be linked using a 25-residue glycine linker (GGGGGGGGGGGGGGGGGGGGGGGGG)
The linker can be added automatically during preprocessing
ESMFold installation is required
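
The linker convention above can be sketched in a few lines. The helper names `link_chains` and `chain_spans` are hypothetical and only illustrate the idea; the sequences are toy fragments, not full antibody chains. Keeping track of the index ranges makes it possible to map per-residue outputs back to each chain and to discard the linker positions.

```python
LINKER = "G" * 25  # 25-residue glycine linker described above


def link_chains(heavy, light, linker=LINKER):
    """Concatenate heavy and light chains with the linker in between."""
    return heavy + linker + light


def chain_spans(heavy, light, linker=LINKER):
    """Return (start, end) index ranges of each chain in the linked sequence.

    Ranges are zero-based with an exclusive end, matching Python slicing.
    """
    heavy_span = (0, len(heavy))
    light_start = len(heavy) + len(linker)
    light_span = (light_start, light_start + len(light))
    return heavy_span, light_span


heavy = "EVQLVESGGGLVQPGGSLRLSCAAS"  # toy fragment, not a full chain
light = "DIQMTQSPSSLSASVGDRVTITC"    # toy fragment, not a full chain

linked = link_chains(heavy, light)
h_span, l_span = chain_spans(heavy, light)
```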

Example Use Cases

Antibody structure prediction
Structural downstream analysis
Docking preparation
Conformation-sensitive modeling

3. Conservation Mode

A conservation-aware fine-tuning mode that scans protein sequences for mutation effects and residue preferences during adaptation.

This mode estimates the probability of observing each amino acid at every sequence position across fine-tuning steps.

It is useful for studying sequence conservation, mutation tolerance, and evolutionary constraints.
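
To make the idea of a per-position amino acid distribution concrete, here is a minimal sketch that turns raw scores for one position into probabilities and compares a mutant against the wild-type residue. How AbTune obtains these scores and how it defines its mutation effect scores is not described here; the softmax step, the log-odds score, and the function names are assumptions based on a common convention for language-model scoring.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def position_profile(logits):
    """Map each amino acid to its probability at one sequence position."""
    return dict(zip(AMINO_ACIDS, softmax(logits)))


def mutation_log_odds(profile, wild_type, mutant):
    """Log ratio of mutant to wild-type probability.

    Negative values mean the wild-type residue is preferred.
    """
    return math.log(profile[mutant] / profile[wild_type])


# Toy scores for a single position that strongly prefer cysteine.
toy_logits = [0.0] * len(AMINO_ACIDS)
toy_logits[AMINO_ACIDS.index("C")] = 4.0

profile = position_profile(toy_logits)
score = mutation_log_odds(profile, wild_type="C", mutant="A")
```

Repeating this for every position, and for the model state at each fine-tuning step of interest, would give a profile of the kind listed in the outputs of this mode.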

Typical Outputs

Output	Description
Conservation profiles	Probability distribution over amino acids at each residue position
Mutation effect scores	Predicted impact of sequence mutations
Training logs	Optimization and loss metrics

Example Use Cases

Mutation-effect prediction
Conservation analysis
Functional residue identification
Protein engineering studies

Configuration Files

All experiments are controlled through YAML configuration files.

Example configuration:

# ==============================================
#   ⚙️  ESM2 Fine-Tuning Configuration
# ==============================================

# Name of pretrained ESM model
# All models from esm.pretrained are supported
esm_model_name: esm2_t33_650M_UR50D

# Running mode
# Choose one of [ESM2, ESMFold, Conservation]
running_mode: ESM2

# Directory for saving outputs
save_path: ./outputs

# Number of fine-tuning steps
steps: 50

# Layer indices for LoRA injection
# [] means all layers
inject_layers: [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

# Target module for LoRA injection
lora_target_replace_module: MultiheadAttention

# Fine-tuning steps used for scoring
score_seq_steps_list: [1,2,3,4,5]

# Target protein sequence
seq: VVKFMDVYQRSYCHPIETLVDIFQEYPDEIEYIFKPSCVPLMRCGGCCNDEGLECVPTEESNITMQIMRIKPHQGQHIGEMSFLQHNKCECRPK

# PDB identifier and chain ID
# Format: pdbid_chainid
pdbid_chainid: 1a2c_A

Important Parameters

Parameter	Description
`esm_model_name`	Name of pretrained ESM model
`running_mode`	Running mode (`ESM2`, `ESMFold`, `Conservation`)
`save_path`	Directory for output files
`steps`	Number of fine-tuning steps
`inject_layers`	Transformer layers used for selective fine-tuning
`score_seq_steps_list`	Steps used for sequence scoring
`seq`	Input amino acid sequence
`pdbid_chainid`	PDB identifier and chain
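
The parameters above can be sanity-checked before launching a job. The sketch below is not part of AbTune: `check_config` and `resolve_layers` are hypothetical helpers, the zero-based layer indexing is inferred from the example configuration, and the rule that an empty `inject_layers` means all layers comes from the comment in that example.

```python
VALID_MODES = {"ESM2", "ESMFold", "Conservation"}


def resolve_layers(inject_layers, num_layers):
    """Expand an empty list to all layers and reject out-of-range indices."""
    if not inject_layers:
        return list(range(num_layers))
    bad = [i for i in inject_layers if not 0 <= i < num_layers]
    if bad:
        raise ValueError(f"layer indices out of range: {bad}")
    return sorted(set(inject_layers))


def check_config(cfg, num_layers):
    """Return a copy of cfg with inject_layers resolved, after basic checks."""
    if cfg["running_mode"] not in VALID_MODES:
        raise ValueError(f"unknown running_mode: {cfg['running_mode']}")
    if cfg["steps"] < 1:
        raise ValueError("steps must be a positive integer")
    resolved = dict(cfg)
    resolved["inject_layers"] = resolve_layers(cfg["inject_layers"], num_layers)
    return resolved


# A 33-layer backbone is assumed here, as in the esm2_t33 family.
cfg = {"running_mode": "ESM2", "steps": 50, "inject_layers": []}
checked = check_config(cfg, num_layers=33)
```

Failing early on an unknown mode or an out-of-range layer index is cheaper than discovering the problem after the model weights have been loaded.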

Input Format

Input sequences should be provided in FASTA format.

Example:

>antibody_1_heavy
EVQLVESGGGLVQPGGSLRLSCAAS...

For paired heavy/light chain data, the linker must be inserted manually between the two chains.
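
For readers who want to inspect their input before running a job, a minimal FASTA reader using only the standard library might look like the following. The function `parse_fasta`, the second record name, and the toy sequences are illustrative assumptions; this is not how AbTune reads its input internally.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping header to sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequences may be wrapped over several lines; join them.
            records[header] += line
    return records


# Toy fragments, wrapped over two lines to show multi-line handling.
example = """>antibody_1_heavy
EVQLVESGGGLVQ
PGGSLRLSCAAS
>antibody_1_light
DIQMTQSPSSLSAS
"""

records = parse_fasta(example)
```

Biopython, which is listed among the requirements, provides `Bio.SeqIO.parse(handle, "fasta")` for the same purpose and handles more edge cases.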

Project Structure

Finetune-Ab/
├── AbTune/                 Core Python package
├── configs/                Example configuration files
├── .github/workflows/      CI configuration
├── pyproject.toml          Package metadata
├── README.md
└── LICENSE
