5 changes: 3 additions & 2 deletions README.md
@@ -64,7 +64,7 @@ Please follow these steps:
* Install `aria2c`. On most Linux distributions it is available via the
package manager as the `aria2` package (on Debian-based distributions
this can be installed by running `sudo apt install aria2`).
-Same for `rsync`.
+Same for `rsync` and `parallel` (`sudo apt install rsync parallel`).

* Please use the script `scripts/download_all_data.sh` to download and set
up full databases. This may take substantial time (download size is 556
@@ -150,7 +150,7 @@ Please follow these steps:

### Genetic databases

-This step requires `aria2c` to be installed on your machine.
+This step requires `aria2c` and `rsync` to be installed on your machine. `GNU Parallel` is also highly recommended to speed up the unzipping process.

AlphaFold needs multiple genetic (sequence) databases to run:

@@ -731,6 +731,7 @@ and packages:
* [Docker](https://www.docker.com)
* [HH Suite](https://github.com/soedinglab/hh-suite)
* [HMMER Suite](http://eddylab.org/software/hmmer)
+* [GNU Parallel](https://www.gnu.org/software/parallel/)
* [Haiku](https://github.com/deepmind/dm-haiku)
* [JAX](https://github.com/google/jax/)
* [Kalign](https://msa.sbc.su.se/cgi-bin/msa.cgi)
7 changes: 6 additions & 1 deletion scripts/download_pdb_mmcif.sh
@@ -50,7 +50,12 @@ rsync --recursive --links --perms --times --compress --info=progress2 --delete -
"${RAW_DIR}"

echo "Unzipping all mmCIF files..."
-find "${RAW_DIR}/" -type f -iname "*.gz" -exec gunzip {} +
+if command -v parallel >/dev/null 2>&1
+then
+  find "${RAW_DIR}/" -type f -iname "*.gz" -print0 | parallel -0 -j -1 --xargs gunzip
+else
+  find "${RAW_DIR}/" -type f -iname "*.gz" -exec gunzip {} +
+fi

echo "Flattening all mmCIF files..."
mkdir --parents "${MMCIF_DIR}"
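
A possible refinement of the check in `scripts/download_pdb_mmcif.sh`, offered as a sketch and not as part of this change: `command -v parallel` also succeeds when the unrelated `parallel` from the `moreutils` package is installed, and that tool does not accept `-0`, `-j -1` or `--xargs`. The helper names `have_gnu_parallel` and `unzip_all` below are hypothetical. The sketch assumes that GNU Parallel prints a banner containing `GNU parallel` for `--version`, and it keeps the `find ... -exec gunzip {} +` fallback from the diff.

```shell
# Hypothetical helper (not in the PR): succeeds only when the `parallel`
# found on PATH identifies itself as GNU Parallel.
have_gnu_parallel() {
  command -v parallel >/dev/null 2>&1 &&
    parallel --version </dev/null 2>/dev/null | grep -q 'GNU parallel'
}

# Hypothetical helper (not in the PR): gunzip every *.gz file below "$1".
unzip_all() {
  local dir="$1"
  if have_gnu_parallel; then
    # -0 reads NUL-separated names, -j -1 uses all CPU threads but one,
    # --xargs packs as many file names as fit into each gunzip call.
    find "${dir}/" -type f -iname "*.gz" -print0 |
      parallel -0 -j -1 --xargs gunzip
  else
    # Fallback from the original script: a single gunzip process.
    find "${dir}/" -type f -iname "*.gz" -exec gunzip {} +
  fi
}
```

In the script this would be called as `unzip_all "${RAW_DIR}"`. Both branches leave the directory in the same state, so the later flattening step would not need to change.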