Add MMseqs makepaddedseqdb #10239
Merged: keiran-rowell-unsw merged 32 commits into nf-core:master from Australian-Structural-Biology-Computing:add-createpaddeddb on Feb 26, 2026
Commits (32):
cf6a628 feat(init): Commit inital meta.yml file (nbtm-sh)
4481c5d feat(env): add environment file (nbtm-sh)
af5681b feat(main): Add main boilerplate (nbtm-sh)
baaeda9 feat(ouptut): add output variables (nbtm-sh)
7ec171a feat(stub): add stub (nbtm-sh)
b75121d feat(meta): update meta to include output hints (nbtm-sh)
b5a6383 feat(tests): add tests (nbtm-sh)
067fa53 feat(test): add snapshot (nbtm-sh)
8f8eb89 feat(when): add when clause (nbtm-sh)
64b8f56 fix(meta): fix meta keys (nbtm-sh)
5c142ac fix(meta): fix output keys (nbtm-sh)
ee48763 fix(meta): fix keys (nbtm-sh)
926f01c fix(prek): fix syntax (nbtm-sh)
e766a2c fix(padded): change from padded to gpu (nbtm-sh)
ef9c9bb fix(spelling): cotigs -> contigs (nbtm-sh)
378fb74 feat(snap): update snapshot with new paths (nbtm-sh)
7705dc9 fix(alias): remove alias (nbtm-sh)
48fe89b fix(naming): change from using reserved variable name (nbtm-sh)
cbe569a Update modules/nf-core/mmseqs/makepaddedseqdb/tests/main.nf.test (nbtm-sh)
22230fb Update modules/nf-core/mmseqs/makepaddedseqdb/tests/main.nf.test (nbtm-sh)
ab8bf25 Update modules/nf-core/mmseqs/makepaddedseqdb/tests/main.nf.test (nbtm-sh)
daf292c Update modules/nf-core/mmseqs/makepaddedseqdb/main.nf (nbtm-sh)
eeda6d4 Update modules/nf-core/mmseqs/makepaddedseqdb/main.nf (nbtm-sh)
3dd6da4 Update modules/nf-core/mmseqs/makepaddedseqdb/meta.yml (nbtm-sh)
c875e02 fix(output): fix db output to use set prefix (nbtm-sh)
fbddba5 fix(meta): fix meta yml (nbtm-sh)
15af087 feat(test): add config variable to tests (nbtm-sh)
afd8904 fix(snapshot): update snapshot to use new variable names (nbtm-sh)
c034130 feat(commit): add commit to add prefix variable (nbtm-sh)
be26522 fix(stub): fix stub run to use prefix instead of prefix_padded (nbtm-sh)
cbcf09a Merge branch 'master' into add-createpaddeddb (nbtm-sh)
49beaac feat(meta): add myself as a maintainer (nbtm-sh)
modules/nf-core/mmseqs/makepaddedseqdb/environment.yml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json | ||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| dependencies: | ||
| - bioconda::mmseqs2=18.8cc5c |
modules/nf-core/mmseqs/makepaddedseqdb/main.nf
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| process MMSEQS_MAKEPADDEDSEQDB { | ||
| tag "${meta.id}" | ||
| label 'process_low' | ||
| conda "${moduleDir}/environment.yml" | ||
| | ||
| container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container | ||
| ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/fe/fe49c17754753d6cd9a31e5894117edaf1c81e3d6053a12bf6dc8f3af1dffe23/data' | ||
| : 'community.wave.seqera.io/library/mmseqs2:18.8cc5c--af05c9a98d9f6139'}" | ||
| | ||
| input: | ||
| tuple val(meta), path(db_in) | ||
| | ||
| output: | ||
| tuple val(meta), path("${prefix}/"), emit: db_padded | ||
| tuple val("${task.process}"), val('mmseqs'), eval('mmseqs version'), topic: versions, emit: versions_mmseqs | ||
| | ||
| when: | ||
| task.ext.when == null || task.ext.when | ||
| | ||
| script: | ||
| def args = task.ext.args ?: '' | ||
| def args2 = task.ext.args2 ?: '*.dbtype' | ||
| prefix = task.ext.prefix ?: "${meta.id}" | ||
| if ("${db_in}" == "${prefix}") { | ||
| error("Input and output names of databases are the same, set prefix in module configuration to disambiguate!") | ||
| } | ||
| """ | ||
| DB_TARGET_PATH_NAME=\$(find -L "${db_in}/" -maxdepth 1 -name "${args2}" | sed 's/\\.[^.]*\$//' | sed -e 'N;s/^\\(.*\\).*\\n\\1.*\$/\\1\\n\\1/;D' ) | ||
| mkdir -p ${prefix} | ||
| mmseqs \\ | ||
| makepaddedseqdb \\ | ||
| \$DB_TARGET_PATH_NAME \\ | ||
| ${prefix}/${prefix} \\ | ||
| ${args} | ||
| """ | ||
| | ||
| stub: | ||
| def args = task.ext.args ?: '' | ||
| prefix = task.ext.prefix ?: "${meta.id}" | ||
| """ | ||
| echo ${args} | ||
| mkdir -p ${prefix} | ||
| touch ${prefix}/${prefix} | ||
| touch ${prefix}/${prefix}.dbtype | ||
| touch ${prefix}/${prefix}.index | ||
| touch ${prefix}/${prefix}.lookup | ||
| touch ${prefix}/${prefix}_h | ||
| touch ${prefix}/${prefix}_h.dbtype | ||
| touch ${prefix}/${prefix}_h.index | ||
| """ | ||
| } | ||
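The `DB_TARGET_PATH_NAME` line in the script block above is dense, so here is a minimal shell sketch of what it appears to do: list the `*.dbtype` files, strip their extension, then reduce the names to their longest common prefix. It assumes GNU `find` and GNU `sed` (BSD `sed` treats `N` at end of input differently). The directory `demo_db`, the file names, and the helper `resolve_db_prefix` are hypothetical stand-ins for illustration; the pipeline inside the helper is the module's own with the Nextflow escaping removed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a directory produced by `mmseqs createdb`.
workdir=$(mktemp -d)
mkdir -p "$workdir/demo_db"
touch "$workdir/demo_db/test_query" \
      "$workdir/demo_db/test_query.dbtype" \
      "$workdir/demo_db/test_query.index" \
      "$workdir/demo_db/test_query_h" \
      "$workdir/demo_db/test_query_h.dbtype" \
      "$workdir/demo_db/test_query_h.index"

resolve_db_prefix() {
    # $1: database directory, $2: glob for marker files (module default: *.dbtype)
    # 1st sed: drop the final extension (".dbtype").
    # 2nd sed: join two lines, keep their longest common prefix, repeat until input ends.
    find -L "$1/" -maxdepth 1 -name "$2" \
        | sed 's/\.[^.]*$//' \
        | sed -e 'N;s/^\(.*\).*\n\1.*$/\1\n\1/;D'
}

db_prefix=$(resolve_db_prefix "$workdir/demo_db" '*.dbtype')
echo "$db_prefix"
```

Here `test_query` and `test_query_h` collapse to the shared prefix `.../demo_db/test_query`, which is the path MMseqs2 expects as the database name. If a directory held several unrelated databases, the common prefix would likely be too short to be useful, which is presumably why the glob is exposed through `ext.args2`.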
modules/nf-core/mmseqs/makepaddedseqdb/meta.yml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,64 @@ | ||
| name: "mmseqs_makepaddedseqdb" | ||
| description: Create an MMseqs padded database from an existing MMseqs database | ||
| keywords: | ||
| - protein sequence | ||
| - databases | ||
| - clustering | ||
| - searching | ||
| - indexing | ||
| - mmseqs2 | ||
| tools: | ||
| - "mmseqs": | ||
| description: "MMseqs2: ultra fast and sensitive sequence search and clustering | ||
| suite" | ||
| homepage: "https://github.com/soedinglab/MMseqs2" | ||
| documentation: "https://mmseqs.com/latest/userguide.pdf" | ||
| tool_dev_url: "https://github.com/soedinglab/MMseqs2" | ||
| doi: "10.1093/bioinformatics/btw006" | ||
| licence: | ||
| - "GPL v3" | ||
| identifier: biotools:mmseqs | ||
| input: | ||
| - - meta: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing sample information | ||
| e.g. `[ id:'test', single_end:false ]` | ||
| - db_in: | ||
| type: directory | ||
| description: Input of existing MMseqs database | ||
| output: | ||
| db_padded: | ||
| - - meta: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing sample information | ||
| e.g. `[ id:'test', single_end:false ]` | ||
| - "${prefix}/": | ||
| type: directory | ||
| description: The padded MMseqs2 database | ||
| versions_mmseqs: | ||
| - - ${task.process}: | ||
| type: string | ||
| description: The name of the process | ||
| - mmseqs: | ||
| type: string | ||
| description: The name of the tool | ||
| - mmseqs version: | ||
| type: eval | ||
| description: The expression to obtain the version of the tool | ||
| topics: | ||
| versions: | ||
| - - ${task.process}: | ||
| type: string | ||
| description: The name of the process | ||
| - mmseqs: | ||
| type: string | ||
| description: The name of the tool | ||
| - mmseqs version: | ||
| type: eval | ||
| description: The expression to obtain the version of the tool | ||
| authors: | ||
| - "@nbtm-sh" | ||
| maintainers: | ||
| - "@nbtm-sh" |
modules/nf-core/mmseqs/makepaddedseqdb/tests/main.nf.test
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| nextflow_process { | ||
| | ||
| name "Test Process MMSEQS_MAKEPADDEDSEQDB" | ||
| script "../main.nf" | ||
| process "MMSEQS_MAKEPADDEDSEQDB" | ||
| tag "modules" | ||
| tag "modules_nfcore" | ||
| tag "mmseqs" | ||
| tag "mmseqs/makepaddedseqdb" | ||
| tag "mmseqs/createdb" | ||
| | ||
| config "./nextflow.config" | ||
| | ||
| setup { | ||
| run("MMSEQS_CREATEDB") { | ||
| script "../../../mmseqs/createdb/main.nf" | ||
| process { | ||
| """ | ||
| input[0] = [ [ id:'test_query' ], | ||
| file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fasta/contigs.fasta', checkIfExists: true) | ||
| ] | ||
| """ | ||
| } | ||
| } | ||
| } | ||
| | ||
| test("mmseqs_db sarscov2 contigs") { | ||
| | ||
| when { | ||
| params { | ||
| module_prefix = "test_query_gpu" | ||
| } | ||
| process { | ||
| """ | ||
| input[0] = MMSEQS_CREATEDB.out.db | ||
| """ | ||
| } | ||
| } | ||
| | ||
| then { | ||
| assertAll( | ||
| { assert process.success }, | ||
| { assert snapshot(sanitizeOutput(process.out)).match() | ||
| } | ||
| ) | ||
| } | ||
| | ||
| } | ||
| } | ||
36 changes: 36 additions & 0 deletions
modules/nf-core/mmseqs/makepaddedseqdb/tests/main.nf.test.snap
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| { | ||
| "mmseqs_db sarscov2 contigs": { | ||
| "content": [ | ||
| { | ||
| "db_padded": [ | ||
| [ | ||
| { | ||
| "id": "test_query" | ||
| }, | ||
| [ | ||
| "test_query_gpu:md5,5b24585ba92fd826c78b8664c63b4e95", | ||
| "test_query_gpu.dbtype:md5,01d39098f2bfee5c808a3b4ff54deac2", | ||
| "test_query_gpu.index:md5,5946b4989d08320d9daca503155ba693", | ||
| "test_query_gpu.lookup:md5,3eb85c645034a0717db62ef0a3da5479", | ||
| "test_query_gpu_h:md5,a9fca4931be476b8f302cc27b5dff9b0", | ||
| "test_query_gpu_h.dbtype:md5,740bab4f9ec8808aedb68d6b1281aeb2", | ||
| "test_query_gpu_h.index:md5,ce0ca30c2e57677077cc23823ef17206" | ||
| Review comment on lines +11 to +17 (Contributor): Surprised these are consistent, but that's good! | ||
| ] | ||
| ] | ||
| ], | ||
| "versions_mmseqs": [ | ||
| [ | ||
| "MMSEQS_MAKEPADDEDSEQDB", | ||
| "mmseqs", | ||
| "18.8cc5c" | ||
| ] | ||
| ] | ||
| } | ||
| ], | ||
| "timestamp": "2026-02-25T10:33:19.910807101", | ||
| "meta": { | ||
| "nf-test": "0.9.4", | ||
| "nextflow": "25.04.6" | ||
| } | ||
| } | ||
| } | ||
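The `db_padded` entries in the snapshot above have the form `<file>:md5,<digest>`. For anyone wanting to check by hand whether two runs of the module produce identical padded databases, a rough sketch along these lines may help. It assumes GNU coreutils `md5sum` is available; the function name `snapshot_lines` and the demo directory are made up for illustration and are not part of nf-test.

```shell
#!/usr/bin/env bash
set -euo pipefail

snapshot_lines() {
    # Print "<name>:md5,<digest>" for every regular file in directory $1,
    # in the shell's glob (name-sorted) order.
    local dir=$1 f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s:md5,%s\n' "$(basename "$f")" "$(md5sum < "$f" | cut -d' ' -f1)"
    done
}

# Hypothetical demo: an empty placeholder file instead of a real database.
demo_dir=$(mktemp -d)
: > "$demo_dir/test_query_gpu.lookup"
lines=$(snapshot_lines "$demo_dir")
echo "$lines"
```

Running `snapshot_lines` on the module's output directory from two separate runs and diffing the results would show whether the digests stay stable.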
modules/nf-core/mmseqs/makepaddedseqdb/tests/nextflow.config
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| process { | ||
| withName: "MMSEQS_MAKEPADDEDSEQDB" { | ||
| ext.prefix = params.module_prefix | ||
| } | ||
| } |