Conversation
|
@nf-core-bot fix linting |
There was a problem hiding this comment.
Pull request overview
Adds a new doublet-detection option to the nf-core/scdownstream pipeline by integrating an scDblFinder-based method as a local module, exposing it via the existing --doublet_detection parameter and wiring it into the doublet detection subworkflow.
Changes:
- Added a new local Nextflow module
SCDBLFINDER(conda env, container, R implementation, nf-test coverage). - Integrated
scdblfinderinto theDOUBLET_DETECTIONsubworkflow and pipeline test configs. - Updated user-facing documentation and citations to include scDblFinder.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| subworkflows/local/doublet_detection/main.nf | Adds SCDBLFINDER execution path and mixes its outputs into predictions/versions channels. |
| nextflow_schema.json | Extends doublet_detection allowed methods and help text to include scdblfinder. |
| modules/local/doublet_detection/scdblfinder/main.nf | New Nextflow process definition for the scDblFinder module. |
| modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R | Implements scDblFinder execution, multiplet-rate estimation, and prediction export. |
| modules/local/doublet_detection/scdblfinder/environment.yml | Defines conda dependencies for the new module. |
| modules/local/doublet_detection/scdblfinder/meta.yml | Adds module metadata, I/O definitions, and tool citation details. |
| modules/local/doublet_detection/scdblfinder/tests/main.nf.test | Adds nf-test coverage for real and stub runs. |
| modules/local/doublet_detection/scdblfinder/tests/main.nf.test.snap | Adds test snapshots for the module outputs. |
| conf/modules.config | Adds publishDir/prefix handling for SCDBLFINDER outputs. |
| conf/test.config | Enables scdblfinder in the test profile’s doublet_detection methods. |
| conf/test_full.config | Enables scdblfinder in the full test profile’s doublet_detection methods. |
| docs/output.md | Documents scDblFinder as a supported doublet detection tool and output folder. |
| README.md | Lists scDblFinder among supported doublet detection tools. |
| CITATIONS.md | Adds scDblFinder citation entry. |
| CHANGELOG.md | Notes addition of scDblFinder module. |
| ro-crate-metadata.json | Updates embedded project description to include scDblFinder in the tool list. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
…der.R Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
nictru
left a comment
There was a problem hiding this comment.
Looks good in general. I did not look through the R code in detail yet (will do that in the evening)
|
@nictru want me to change all the doublet detection tools to use |
|
You can also do it in this PR, I am not very strict about this |
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
|
Also please include it here |
- Added optional `doublet_rate` column in input samplesheet for per-sample expected doublet rate in `scDblFinder`. - Updated `scDblFinder` to utilize internal `dbr` estimation when `doublet_rate` is not provided. - Modified input and output handling in `SCDBLFINDER` process to accommodate new `doublet_rate` parameter. - Updated relevant documentation including CHANGELOG, README, and usage examples to reflect changes. - Added tests to validate functionality with provided `doublet_rate`.
|
@nf-core-bot fix linting |
modules/local/doublet_detection/scdblfinder/templates/scdblfinder.R
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
… improve readability
There was a problem hiding this comment.
I think things start to clear up. For you to better understand my rationale:
Generally all modules in the pipeline follow the same schema:
- Read input
- Apply function
- Write output
So that the pipeline modules are simple wrappers around the R/python functions.
Deviations from this schema (additional manipulations etc) are sometimes necessary but require clear justification.
If you look at how complicated the R script was in the beginning vs what it is now, this becomes apparent. Helps with keeping overview, finding bugs, and everything else
| sce <- scDblFinder( | ||
| assays(sce)[[1]], | ||
| BPPARAM = bp, | ||
| dbr = dbr | ||
| ) |
There was a problem hiding this comment.
I found that scDblFinder has an optional samples argument, which is highly relevant here. You should pass the batch_col to it, which you can get via the ch_batch_col in the doublet detection subworkflow
| # Write the updated SingleCellExperiment directly as h5ad, explicitly mapping the | ||
| # primary assay to AnnData X so downstream readers see a valid matrix field. | ||
| primary_assay <- assayNames(sce)[1] | ||
| if (is.na(primary_assay) || primary_assay == "") { | ||
| stop("scDblFinder output is missing a primary assay; cannot write h5ad output.") | ||
| } |
There was a problem hiding this comment.
I think this is also unnecessary because the sce is created from the anndata in a way that is perfectly prepared for reversing the process
PR checklist
This comment contains a description of changes (with reason).
added an adaptation of my implementation of scDblFinder 's regular random sampling-based doublet detection method as a module to the pipeline. (If this works fine, we can add the clustering-dependent method too as soon as tomorrow)
It can be run using the settings --doublet_detection scdblfinder
It also takes a new optional parameter
doublet_ratevia a columnin the samplesheet with an expected float doublet rate between 0 and 1. If not provided,scDblFinderestimates it internally.If you've fixed a bug or added code that should be tested, add tests!
Super basic tests to run without errors were added to match the other R-based modules
If you've added a new tool - have you followed the pipeline conventions in the contribution docs
Done!
If necessary, also make a PR on the nf-core/scdownstream branch on the nf-core/test-datasets repository.
No need, works with the usual fitered testdataset that other doublet detection tools use
Make sure your code lints (
nf-core pipelines lint).It does
Ensure the test suite passes (
nextflow run . -profile test,docker --outdir <OUTDIR>).**My module runs fine within the pipeline tests, quit after I saw it complete, retest needed on your end @nictru as you mentioned in the Slack thread **
Check for unexpected warnings in debug mode (
nextflow run . -profile debug,test,docker --outdir <OUTDIR>).**Didn't bother with this either as per the same thread in Slack by @nictru **
Usage Documentation in
docs/usage.mdis updated.Output Documentation in
docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).