@kornkv
Contributor

@kornkv kornkv commented Dec 8, 2025

draft of FASTQ_REMOVE_ADAPTERS_AND_MERGE subworkflow with tests

mirpedrol and others added 25 commits December 8, 2025 17:00
* add ontologies to tcoffee regressive

* add ontologies to upp align
* Add module pbmarkdup

* Fix linting

* Update path to test data

* Update with code review (--dup-file, log, check file name collisions)

* Fix linting

* Update path to test data

* Update modules/nf-core/pbmarkdup/meta.yml

* Fix linting
* Enable complex contrast strings

* Update docker image

* Add test case with limma contrast string

* Format changes and add test with shrinkage
* Add deepvariant optional html

* update snapshot

* Update modules/nf-core/deepvariant/rundeepvariant/main.nf

Co-authored-by: Ramprasad Neethiraj <20065894+ramprasadn@users.noreply.github.com>

* trigger html generation

* revert config change

---------

Co-authored-by: Ramprasad Neethiraj <20065894+ramprasadn@users.noreply.github.com>
HISAT2 uses .ht2l extension instead of .ht2 for large genomes.
Updated index detection to match both extensions.

Related to nf-core/rnaseq#1643

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
…erministic (#9489)

* Sort file listing so "first" file is deterministic

* Declare closure parameter per strict syntax

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>

---------

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
Co-authored-by: mashehu <mashehu3@gmail.com>
* sambamba add region bed input

* fix linting

* fix linting

* Apply suggestions from code review

Co-authored-by: Felix Lenner <52530259+fellen31@users.noreply.github.com>

---------

Co-authored-by: Felix Lenner <52530259+fellen31@users.noreply.github.com>
* fix fasta_index_methylseq and fastq_align_dedup workflows for clarity and consistency

- Updated variable names in fasta_index_methylseq to use 'channel' instead of 'Channel' for consistency.
- Renamed UNTAR to UNTAR_BISMARK and UNTAR_BWAMETH for clarity in fasta_index_methylseq.
- Enhanced comments and descriptions in meta.yml files for better understanding of input and output structures.
- Adjusted test cases in fastq_align_dedup workflows to reflect changes in input structure from single-end to paired-end.
- Updated version numbers in test snapshots to reflect recent changes.

* fix: pre-commit lint fixes
* Update glimpse

* Update chunk

* Update concordance

* Revert changes

* Fix glimpse test

* Fix glimpse

* Fix glimpse2 tests

* Update sbwf

* Remove old snapshots

* Update glimpse

* Update modules/nf-core/glimpse2/concordance/tests/main.nf.test

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>

* Update test

---------

Co-authored-by: LouisLeNezet <louislenezet@gmaio.com>
Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
)

* fix order

* add container section

* simplify schema

* require https for singluarity
* update and add topics

* add new topics structure

* add stub test and capture version in snapshot

* update to 9.14.0

* fix singularity be setting cache_dir

* fix stub

---------

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
Co-authored-by: mashehu <mashehu3@gmail.com>
* fix stub version

* stray module
- Add missing gene_id_col parameter definition (defaults to 'gene_id')
- Include gene IDs as first column in all results tables using configurable column name
- Only write output files when there are significant results to avoid empty files
- Mark all results TSV outputs as optional since they're conditionally created
- Update test to use buffering results instead of empty mRNA_abundance results
- Update test snapshots with new file formats including gene_id column

This ensures anota2seq results are consistent with other modules and include
gene identifiers for downstream analysis, while gracefully handling cases
where no genes pass significance thresholds.

Co-authored-by: Sebastian Uhrig <suhrig@users.noreply.github.com>
#9516)

fix(decoupler): reorder imports and ensure environment variables are set before importing modules
Add strdrop build
…9201)

* 🔧 update image and bioconda container to latest version

* ✅ update test snapshots

* 🐛 fix display of version of vuegen

- had no command line interface option to display version, see
Multiomics-Analytics-Group/vuegen#167

* 🎨 display versions.yml content in snapshots

* 🔧 add Dockerfile to install lastet PyPI vuegen version

- does not pass hadolint(er) as of now

* 🚧 add wave containers

* 🔥 remove README again

* 🔥 remove Dockerfile again

* 🚧 try to follow Mahesh's advice

* 🐛 add explicit cache directory

* 🔧 bump to Python 3.12 and remove channel prefix

* 🔧 specify singularity image with https

... as specified in the docs:
https://nf-co.re/docs/tutorials/nf-core_components/using_seqera_containers

* 🚧 set user specified R libarary folder

* ⏪ make docker and conda work again (using nf-core 3.5.1)

* 🔧 switch again to custom docker image instead of wave

- wave leads to too many custom installation issues

* 🐛 try to add font package

* 🔥 remove code moved to image

- singularity runs in devcontainer

* ⏪ add back conda quarto flag

* 🎨 remove trailing whitespace

* 🎨 format again

* 🎨 hopefully the last trailing whitespace

* 📝 document the build process and why the container is needed

* Update image with nf-core one

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>

* Update container name

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>

* Apply suggestion from @mashehu

* Apply suggestion from @mashehu

---------

Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
Co-authored-by: nf-core-bot <core@nf-co.re>
* Update semibin2 module

* Update snapshot

* Remove unneeded snapshot section

---------

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
LouisLeNezet and others added 7 commits December 12, 2025 08:21
* Update glimpse2 sbwf

* Update test

* Add region to beagle5

* Add subworkflow

* Fix linting

* Fix linting

* Fix linting

* Update subworkflows/nf-core/vcf_impute_beagle5/main.nf

Co-authored-by: Nicolas Vannieuwkerke <101190534+nvnieuwk@users.noreply.github.com>

* Add comment

* Update grouping and test

* Remove tag

* Revert change glimpse2 reference

* Revert change glimpse2 sbwf

* Revert change glimpse2 sbwf

* Revert change glimpse2 sbwf

---------

Co-authored-by: LouisLeNezet <louislenezet@gmaio.com>
Co-authored-by: Nicolas Vannieuwkerke <101190534+nvnieuwk@users.noreply.github.com>
* Add vcf_impute_minimac4

* Update linting

* Update test

* Fix linting

* Update minimac4 sbwf

* Remove tag

* Remove tag

* Fix linting

* Add comment

* Update snapshot

* Fix nf-test
…subworkflow (#9559)

Add BBSplit stats to MultiQC files in fastq_qc_trim_filter_setstrandedness

Pass BBSplit stats output to MultiQC for visualization of read binning
statistics. MultiQC 1.33+ includes support for parsing BBSplit stats.txt
files and displaying per-sample read distribution across reference genomes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* proper stub for gz and stub test added

* topic output syntax and tests update

* meta yml updated with topics and ontologies

* meta file curated

* version bump to latest

* update nf-tests properly

* adding self to maintainers

* removed Z flag as is deprecated after v4.10

* conda bug with different pre-built python version fixed
…ved cutadapt versions, since it is now ported to topics
@vagkaratzas vagkaratzas requested a review from miraep8 December 12, 2025 15:01
LouisLeNezet and others added 8 commits December 15, 2025 16:09
)

* Standarize and alignment

* Fix glimpse2 sbwf test

* Fix test

* Add comment

* Update snapshot

---------

Co-authored-by: LouisLeNezet <louislenezet@gmaio.com>
* Remove .view()

* Remove unecessary tags
* testing solo trim-galore container, without adding extra cutadapt and pigz

* Syntax updates and topic version for manta modules (#9556)

* update manta germline

* topics convertinversion

* topics convertinversion

* topics manta/somatic

* topics manta/tumoronly

* Syntax updates and topics of jasminesv (#9554)

syntax updates and topics of jasminesv

* Update `Modkit pileup`  (#9553)

* update yaml

* update main.nf

* modified test runs

* update bedmethyltobigwig tests

* update main

* update snapshot

* fix linting

* update snapshots

* remove config

* update module_args

* [automated] Fix linting with Prettier

* changed name

* update main

---------

Co-authored-by: ra25wog <jin.khoo@campus.lmu.de>
Co-authored-by: nf-core-bot <core@nf-co.re>

* Standarize and alignment for all imputation and alignment modules (#9566)

* Standarize and alignment

* Fix glimpse2 sbwf test

* Fix test

* Add comment

* Update snapshot

---------

Co-authored-by: LouisLeNezet <louislenezet@gmaio.com>

* Update Infrastructural dependencies

* Remove .view() (#9567)

* Bump strdrop to 0.3.1 (#9565)

* Remove unecessary tags (#9568)

* Remove .view()

* Remove unecessary tags

* latest container, with cutadapt 5.2

* new output syntax, nf-tests updated, meta updated

* meta yml lint fixed

* trying to fix lint

* lint fix with nf-core tools 3.6.0dev

* removing TRIMGALORE versions output from the FASTQ_FASTQC_UMITOOLS_TRIMGALORE subworkflow

---------

Co-authored-by: Nicolas Vannieuwkerke <101190534+nvnieuwk@users.noreply.github.com>
Co-authored-by: Jinn <155078830+jkh00@users.noreply.github.com>
Co-authored-by: ra25wog <jin.khoo@campus.lmu.de>
Co-authored-by: nf-core-bot <core@nf-co.re>
Co-authored-by: Louis Le Nézet <58640615+LouisLeNezet@users.noreply.github.com>
Co-authored-by: LouisLeNezet <louislenezet@gmaio.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Felix Lenner <52530259+fellen31@users.noreply.github.com>
@pinin4fjords
Member

Note for reviewers: setup-nextflow deletion is intentional cleanup

The diff shows deletion of a setup-nextflow submodule entry - this is correct and beneficial, not an error.

Background

The removal commit by @vagkaratzas (9239334ea - "remove unused folder") correctly cleans this up.


take:
reads // channel: [ val(meta), [ reads ] ]
skip_trimmomatic // boolean
Member

You didn't leave any notes in the PR description, so I'm not sure. Is it your intention that folks would want to run multiple trimming methods, which is why you've avoided a simpler trimmer param or similar, as fastq_qc_trim_filter_setstrandedness does (for example)?

Contributor

Exactly, the idea is to select 1 or more adapter removal (and/or merge) tools. This subworkflow is going to be part of the mega fastq QC/preprocessing subworkflow described here.
Potentially, with pipeline chaining, this could one day become its own standalone pipeline that would be executed before all fastq short read pipelines.

Contributor

After discussing with @jfy133, it seems this subworkflow will act as a plug-in interface that hides the complexity of the underlying tools from the user while letting them pick their adapter removal tool of choice, i.e., one selected tool rather than a chain of tools.
Updating accordingly.
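
A minimal sketch of the single-tool selection described in this thread, assuming a hypothetical string input `tool` replaces the `skip_*` booleans. The workflow name, the accepted `tool` values, and the include paths are illustrative assumptions; only `CUTADAPT.out.reads` and `TRIMGALORE.out.reads` are taken from the diff under review.

```nextflow
// Include paths assume the usual nf-core repository layout.
include { CUTADAPT   } from '../../../modules/nf-core/cutadapt/main'
include { TRIMGALORE } from '../../../modules/nf-core/trimgalore/main'

// Hypothetical workflow name; not the one proposed in this PR.
workflow FASTQ_REMOVE_ADAPTERS_SKETCH {
    take:
    reads // channel: [ val(meta), [ reads ] ]
    tool  // string: 'cutadapt' or 'trimgalore' (assumed values)

    main:
    ch_reads = reads

    // Exactly one tool runs; tools are never chained.
    if (tool == 'cutadapt') {
        CUTADAPT( reads )
        ch_reads = CUTADAPT.out.reads
    } else if (tool == 'trimgalore') {
        TRIMGALORE( reads )
        ch_reads = TRIMGALORE.out.reads
    } else {
        error "Unknown adapter removal tool: ${tool}"
    }

    emit:
    reads = ch_reads // channel: [ val(meta), [ reads ] ]
}
```

With a single selector, a caller cannot request two trimmers at once, and every branch emits reads in the same shape. That matches the "plug-in interface" framing.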

skip_cutadapt // boolean
skip_trimgalore // boolean
skip_bbduk // boolean
contaminants // channel: [ reads ] // fasta, adapters to remove
Member

Would this benefit from metas? I think we normally require those these days (not all of the modules/subworkflows have caught up).

Contributor

At this point in time, I just sourced the inputs in the format used by the included modules. Will update accordingly if and when the module inputs are updated on the module side ;)

Member

My instinct was with Jon, but fair enough. We could try to push pipeline developers to update the original module by requiring meta here (even if we drop it in the subworkflow).

fastp_save_trimmed_fail // boolean
save_merged // boolean
skip_adapterremoval // boolean
text_adapters // channel: [ txt ] // adapters to remove, in adapterremoval text format
Member

Same, think this could do with a meta

Contributor

Same as above

Member

@pinin4fjords pinin4fjords left a comment

I wonder a bit how it fits into the landscape of other trimming/preprocessing workflows, but nothing particularly objectionable.

@vagkaratzas
Contributor

> I wonder a bit how it fits into the landscape of other trimming/preprocessing workflows, but nothing particularly objectionable

Thanks! I'll wait for @jfy133 for a final review pass and will then merge.
Stay tuned for the following updates :D

Member

@jfy133 jfy133 left a comment

Some initial comments - @vagkaratzas and I spoke on Slack in more detail about what I had in my mind when I suggested this subworkflow vs what is implemented here.

I can see why Jon was a bit confused about the purpose of the subworkflow; hopefully it'll be clearer after the refactor we discussed on Slack.

if (!skip_cutadapt) {
CUTADAPT( ch_reads )
ch_reads = CUTADAPT.out.reads
ch_multiqc_files = ch_multiqc_files.mix(CUTADAPT.out.log)
Member

Missing cutadapt versions mixing

Contributor

We don't do that for modules that have been updated to the new version output syntax ;)
As in, it's kinda impossible without the pieces of code at the end of the new pipeline's template.
@maxulysse for back-up!

ch_reads = TRIMGALORE.out.reads
ch_discarded_reads = ch_discarded_reads.mix(TRIMGALORE.out.unpaired)
ch_trimgalore_log = TRIMGALORE.out.log
ch_trimgalore_html = TRIMGALORE.out.html
Member

Missing trimmomatic versions mixing

Contributor

Same as above

)
ch_discarded_reads = ch_discarded_reads.mix(ADAPTERREMOVAL_SE.out.discarded, ADAPTERREMOVAL_PE.out.discarded)
ch_adapterremoval_paired_interleaved = ADAPTERREMOVAL_SE.out.paired_interleaved.mix(ADAPTERREMOVAL_PE.out.paired_interleaved)
ch_versions = ch_versions.mix(ADAPTERREMOVAL_SE.out.versions.first(), ADAPTERREMOVAL_PE.out.versions.first())
Member

Does this work if only one of the two modules is run? I would rather put a dedicated mix after each module invocation.

Contributor

Yep, seemed to be working fine during tests
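
The dedicated-mix layout suggested in this thread could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: the workflow name, the include paths and aliases, the use of `meta.single_end` to branch the reads, and the two-input call signature of the module are all assumed. The `discarded` and `versions` outputs are the ones referenced in the diff.

```nextflow
// Assumed include paths and aliases, following the usual nf-core layout.
include { ADAPTERREMOVAL as ADAPTERREMOVAL_SE } from '../../../modules/nf-core/adapterremoval/main'
include { ADAPTERREMOVAL as ADAPTERREMOVAL_PE } from '../../../modules/nf-core/adapterremoval/main'

// Hypothetical workflow name used only for this sketch.
workflow ADAPTERREMOVAL_MIX_SKETCH {
    take:
    reads         // channel: [ val(meta), [ reads ] ]
    text_adapters // assumed to be a value channel so it is reused for every sample

    main:
    ch_discarded_reads = channel.empty()
    ch_versions        = channel.empty()

    // Assumes meta.single_end marks single-end samples (nf-core convention).
    ch_branched = reads.branch { meta, _fastq ->
        single: meta.single_end
        paired: true
    }

    // Single-end: invoke, then mix this module's outputs immediately.
    ADAPTERREMOVAL_SE( ch_branched.single, text_adapters )
    ch_discarded_reads = ch_discarded_reads.mix(ADAPTERREMOVAL_SE.out.discarded)
    ch_versions        = ch_versions.mix(ADAPTERREMOVAL_SE.out.versions.first())

    // Paired-end: same pattern, kept independent of the single-end call.
    ADAPTERREMOVAL_PE( ch_branched.paired, text_adapters )
    ch_discarded_reads = ch_discarded_reads.mix(ADAPTERREMOVAL_PE.out.discarded)
    ch_versions        = ch_versions.mix(ADAPTERREMOVAL_PE.out.versions.first())

    emit:
    discarded_reads = ch_discarded_reads
    versions        = ch_versions
}
```

Keeping each `mix` next to its own invocation means either call can later be wrapped in its own condition without touching the other module's outputs.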

}

if (!skip_adapterremoval) {
ch_adapterremoval_in = ch_reads
Member

Another question here, just to double check: do all the other tools not care how they receive their data (i.e., can they easily handle both single-end and paired-end reads)?

The reason we keep it separate for ADAPTERREMOVAL is more about mixing the output channels than anything else.

Contributor

Yep! That's how it looks in the tests where nothing is skipped, for both single-end and paired-end reads.

@vagkaratzas vagkaratzas requested a review from jfy133 December 19, 2025 12:10