Adding annotation file param to bigDIANNtoMSstatsFormat#12
Adding annotation file param to bigDIANNtoMSstatsFormat#12Rudhik1904 wants to merge 1 commit intodevelfrom
Conversation
📝 WalkthroughWalkthroughThe changes introduce an optional Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant bigDIANNtoMSstatsFormat as bigDIANNtoMSstatsFormat()
participant reduceBigDIANN as reduceBigDIANN()
participant cleanDIANNChunk as cleanDIANNChunk()
participant MSstatsImport as MSstatsImport()
participant MSstatsClean as MSstatsClean()
participant MSstatsMakeAnnotation as MSstatsMakeAnnotation()
participant writeChunkToFile as .writeChunkToFile()
User->>bigDIANNtoMSstatsFormat: input_file, annotation, ...
bigDIANNtoMSstatsFormat->>reduceBigDIANN: input_file, annotation, ...
reduceBigDIANN->>cleanDIANNChunk: chunk_data, annotation, ...
cleanDIANNChunk->>MSstatsImport: chunk_data
MSstatsImport-->>cleanDIANNChunk: imported_data
cleanDIANNChunk->>MSstatsClean: imported_data
MSstatsClean-->>cleanDIANNChunk: cleaned_data
cleanDIANNChunk->>MSstatsMakeAnnotation: cleaned_data, annotation
MSstatsMakeAnnotation-->>cleanDIANNChunk: annotated_data
cleanDIANNChunk->>writeChunkToFile: annotated_data, output_path
writeChunkToFile-->>cleanDIANNChunk: success
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
R/converters.R (2)
168-184:⚠️ Potential issue | 🔴 CriticalBreaking change: inserting
annotationbeforeoutput_file_namebreaks positional callers.Existing code calling
bigDIANNtoMSstatsFormat(input_file, output_file_name, backend, ...)positionally will now silently assign the output file path toannotationand the backend string tooutput_file_name. Moveannotation = NULLafter the required positional parameters (or at least afterbackend) to preserve backward compatibility.Suggested fix
-bigDIANNtoMSstatsFormat <- function(input_file, - annotation = NULL, - output_file_name, +bigDIANNtoMSstatsFormat <- function(input_file, + output_file_name, backend, + annotation = NULL, MBR = TRUE,
187-191: 🛠️ Refactor suggestion | 🟠 MajorUse
file.pathfor intermediate file path construction.
paste0("reduce_output_", output_file_name)will break whenoutput_file_namecontains directory components (e.g.,"results/output.csv"→"reduce_output_results/output.csv"). Usefile.path(dirname(...), paste0(..., basename(...)))instead.Suggested fix
- reduceBigDIANN(input_file, - paste0("reduce_output_", output_file_name), + intermediate_file <- file.path(dirname(output_file_name), + paste0("reduce_output_", basename(output_file_name))) + reduceBigDIANN(input_file, + intermediate_file, MBR, quantificationColumn, global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff, annotation)The same fix should be applied on line 195 where the intermediate path is read back. Based on learnings, intermediate file paths should be constructed using
file.path(dirname(output_file_name), paste0("prefix_", basename(output_file_name)))to avoid path issues across different working directories.
🧹 Nitpick comments (2)
tests/testthat/test-clean_DIANN.R (2)
6-39: Good test coverage for the annotation path; consider adding aNULLannotation test.The mock-based test correctly validates that
annotationflows through toMSstatsMakeAnnotationand that its output reaches.writeChunkToFile.Consider adding a companion test for the default
annotation = NULLcase to ensure the no-annotation path doesn't regress — especially important given thatMSstatsMakeAnnotationis called unconditionally.
4-4:context()is deprecated in testthat 3rd edition.You can simply remove this line;
test_thatdescriptions are sufficient for grouping.
Motivation and Context
In the previous issue, we created an MSstats big converter for DIANN, but we missed captured annotation information so we are doing that in this story.
Changes
Adding Annotation file param to bigDIANNtoMSstatsFormat
Testing
Created test-clean_DIANN.R
Motivation and Context
The existing MSstats converter for DIANN data files did not capture annotation information. This pull request extends the DIANN conversion pipeline to support optional annotation files, enabling users to add experimental metadata (such as conditions and biological replicates) during the conversion process. The annotation capability is propagated through the chunked data processing workflow to maintain support for large out-of-memory datasets.
Detailed Changes
R/clean_DIANN.R
annotation = NULLparameter toreduceBigDIANN()function signatureannotation = NULLparameter tocleanDIANNChunk()function signaturediann_chunk()nested function to pass theannotationparameter tocleanDIANNChunk()MSstatsMakeAnnotationto the@importFrom MSstatsConvertdeclaration incleanDIANNChunk()documentation@param annotationdocumentation to both function Roxygen blocksMSstatsMakeAnnotation(input, annotation)call incleanDIANNChunk()afterMSstatsClean()to merge annotation data with cleaned DIANN data prior to file outputR/converters.R
annotation = NULLparameter tobigDIANNtoMSstatsFormat()function signature, positioned afterinput_fileparameterreduceBigDIANN()to include and pass through theannotationargumenttests/testthat/test-clean_DIANN.R
MSstatsImport,MSstatsClean,MSstatsMakeAnnotation, and.writeChunkToFileMSstatsMakeAnnotationis called exactly once with correct arguments (cleaned data and annotation)Unit Tests
A new test file
tests/testthat/test-clean_DIANN.Rwas added with a single test case:test_that("cleanDIANNChunk passes annotation to MSstatsMakeAnnotation")- Verifies the annotation parameter is correctly propagated through the chunk processing pipeline using mocked dependencies and assertion checks on function call counts and arguments.Coding Guidelines
The code follows the existing style conventions in the repository. However, there is a pre-existing inconsistency in the codebase regarding function definition syntax:
cleanDIANNChunk = function(...)uses the assignment operator=instead of the more idiomatic<-. This same pattern is also present incleanSpectronautChunkin the same file. While the new code maintains consistency with the existing pattern in its immediate context, this diverges from the general R style guide which recommends using<-for assignments. The R package does not have a configured linter or explicit style guide in its repository, and the Bioconductor CI workflow does not enforce style checks.