78 changes: 78 additions & 0 deletions .gitignore
@@ -0,0 +1,78 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
*.manifest
*.spec

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Project specific
examples/*/results/
blind_prediction/
successed_prediction/
failed_prediction/
multimer_prediction/
test_data/
msa_folder/
structures_all.csv

# Foldseek
pdb*
pdb_*
tmp/
8 changes: 0 additions & 8 deletions Data/Fold-switch_hits-SPEACH_AF/pdb_pairs.csv

This file was deleted.

53 changes: 0 additions & 53 deletions Install/install_colabbatch_linux_101624.sh

This file was deleted.

4 changes: 4 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,4 @@
include README.md
include LICENSE.md
recursive-include cf_random/data *
recursive-include examples *
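
Note on the `recursive-include cf_random/data *` line: `MANIFEST.in` only controls what goes into the source distribution. A minimal sketch of how code might locate a bundled file at runtime is shown below. It assumes the data files are also installed as package data (for example via `include_package_data`), which this diff does not show, and the helper name `packaged_file` is hypothetical.

```python
from importlib.resources import files


def packaged_file(package, *parts):
    """Return a traversable path to a resource inside an installed package.

    Hypothetical helper: `package` is an importable package name and
    `parts` are path components relative to the package root.
    """
    resource = files(package)
    for part in parts:
        resource = resource / part
    return resource
```

With the layout from this PR, a call might look like `packaged_file("cf_random", "data")`, followed by `.iterdir()` to list the bundled files.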
51 changes: 13 additions & 38 deletions README.md
@@ -1,52 +1,27 @@
# Data and code for CF-random
General installation and usage guidance of CF-random for predicting the alternative conformation and fold-switching proteins.<br>
To run CF-random in a Colab notebook, please use following [link](https://colab.research.google.com/drive/16pD2tUMkUx1gwDxZXcSr9WOosYp0ZU6j?authuser=0).<br><br>

General installation and usage guidance of CF-random for predicting the alternative conformation and fold-switching proteins.

To run CF-random in a Colab notebook, please use following [link](https://colab.research.google.com/drive/16pD2tUMkUx1gwDxZXcSr9WOosYp0ZU6j?authuser=0).

<a target="_blank" href="https://colab.research.google.com/drive/16pD2tUMkUx1gwDxZXcSr9WOosYp0ZU6j?authuser=0">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open CF-random Colab"/>
</a>


# Installation
CF-random uses the [localcolabfold](https://github.com/YoshitakaMo/localcolabfold) and [Foldseek](https://github.com/steineggerlab/foldseek) under linux environment.<br>
For more details about localcolabfold, please visit [here.](https://github.com/YoshitakaMo/localcolabfold) <br>
We currently not support the Windows and MacOS environment.<br>

Installation process including localcolabfold, dependencies, and Foldseek is done with following commands.
```
wget https://raw.githubusercontent.com/YoshitakaMo/localcolabfold/main/install_colabbatch_linux.sh
bash install_colabbatch_linux.sh

** Or use a bash script in install folder
bash install_colabbatch_linux.sh
```
<br>


After the installation of localcolabfold, add the localcolabfold path to your .bashrc file:.<br>
```
export PATH="/path/to/your/localcolabfold/colabfold-conda/bin:$PATH"
```
<br>
CF-random uses [ColabFold](https://github.com/sokrypton/ColabFold) (for structure prediction) and [Foldseek](https://github.com/steineggerlab/foldseek) (for structure search) under Linux environment.

Then reactivate your .bashrc file <br>
**For installation details, see [INSTALL.md](INSTALL.md)**

Now create a conda new conda environment:
```
conda create --name CF-random python=3.10
conda activate CF-random
pip install textalloc tmtools adjustText thefuzz mdtraj biopython seaborn MDAnalysis
conda install conda-forge::pymol-open-source
pip3 install -U scikit-learn
```
Once the dependencies are installed, install Foldseek.
<br>
```
Quick start:
```bash
conda create --name cf-random python=3.10 -y
conda activate cf-random
pip install -e ".[colabfold]"
conda install -c conda-forge -c bioconda foldseek
foldseek databases PDB pdb tmp
```
<br>

### We recommend running the foldseek databases command in a directory where the libraries can be stored. <br>


# Usage
@@ -65,7 +40,7 @@ foldseek databases PDB pdb tmp
--type #### | can choose the model type of Colabfold. e.g.) ptm, monomer, and multimer
--options ### | AC: predicting alternative conformations of protein with references, FS: predicting the fold-switching protein with references, and blind: predicting the alternative conformations or fold-switching proteins without reference PDB files.
```
* In default mode (fold-switching and alternative conformation), CF-ramdon produces the results of TM-scores (csv and png files), plDDT, and information of selected random MSA. If CF-random predicts the both folds, generated prediction files are deposited under successed_prediction/pdb1_name and additional_sampling/pdb1_name . If not, it would not generate anything. <br>
* In default mode (fold-switching and alternative conformation), CF-random produces the results of TM-scores (csv and png files), plDDT, and information of selected random MSA. If CF-random predicts the both folds, generated prediction files are deposited under successed_prediction/pdb1_name and additional_sampling/pdb1_name . If not, it would not generate anything. <br>
* Before running the default fold-switching mode, the "range_fs_pairs_all.txt" file must be set up. Each line gives the names of the reference PDB files, the residue ranges of the reference PDB files, and the residue ranges of the prediction files. ColabFold generates residue indices starting from 1, so please choose the residue range of the fold-switching region accordingly. CF-random reads the residue index from the PDB file, so make sure that the selected residue range is correct. <br>
examples) pdb1, pdb2, XXX-XXX, XXX-XXX, XXX-XXX, XXX-XXX (residue range of reference 1, residue range of reference 2, residue range of prediction 1, residue range of prediction 2) <br>
* --nMSA can be applied for all options, but --nESN cannot be used for blind mode.
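
As an illustration of the `range_fs_pairs_all.txt` line format described above, a minimal parsing sketch might look like the following. The function names and dictionary keys are hypothetical and are not taken from CF-random. The field order follows the example line, and residue indices are assumed to be non-negative integers written as `start-end`.

```python
def parse_range(text):
    """Turn 'start-end' into an inclusive (start, end) tuple of ints."""
    start, end = text.strip().split("-")
    return int(start), int(end)


def parse_pair_line(line):
    """Parse one 'pdb1, pdb2, r1, r2, p1, p2' line (hypothetical helper)."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 6:
        raise ValueError(
            f"expected 6 comma-separated fields, got {len(fields)}"
        )
    ranges = [parse_range(field) for field in fields[2:]]
    return {
        "pdb1": fields[0],
        "pdb2": fields[1],
        "ref1_range": ranges[0],   # residue range of reference 1
        "ref2_range": ranges[1],   # residue range of reference 2
        "pred1_range": ranges[2],  # residue range of prediction 1
        "pred2_range": ranges[3],  # residue range of prediction 2
    }
```

For example, `parse_pair_line("pdb1, pdb2, 10-50, 12-52, 1-41, 1-41")` would give the two names plus four integer ranges. Checking the field count up front surfaces a malformed line early instead of failing later during structure comparison.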
18 changes: 18 additions & 0 deletions cf_random/__init__.py
@@ -0,0 +1,18 @@
"""
CF-random: Predicting alternative conformations and fold-switching proteins

A package for identifying and analyzing protein fold-switching and alternative conformations
using AlphaFold predictions and structural analysis tools.
"""

__version__ = "0.1.0"
__author__ = "Myeongsang (Samuel) Lee"
__all__ = [
    "main",
]

# Import main modules for easier access
try:
    from .core import main
except ImportError:
    pass
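
Note on the guarded import: because the `ImportError` is swallowed, `cf_random.main` may simply be absent when `.core` or one of its dependencies fails to import, even though `"main"` is listed in `__all__`. A minimal sketch of how calling code could check for this is shown below. The helper name `load_entry_point` is hypothetical and is not part of this PR.

```python
import importlib


def load_entry_point(package_name="cf_random", attr="main"):
    """Return `package.attr`, or None if the package or attribute is missing.

    Hypothetical helper: it treats a failed package import and a missing
    attribute the same way, so the caller can report one clear error.
    """
    try:
        package = importlib.import_module(package_name)
    except ImportError:
        return None
    return getattr(package, attr, None)
```

A caller could then test `load_entry_point() is None` and print an installation hint rather than hitting an `AttributeError` later.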