Counts are sometimes higher than coverage (circRNA specific?) #43

@Raif-fl

In MARINE's output, the count at an edit site is sometimes greater than the coverage at that site. To reproduce this behaviour, we can run MARINE on a small example dataset using the following commands:

module load marine

python /tscc/projects/ps-yeolab3/kflanagan/MARINE-1.0.9-alpha/marine.py --bam_filepath /tscc/lustre/ddn/scratch/kflanagan/Tau_marine_run/one_circ_files/one_circ.bam --output_folder ./results --strandedness 0 --cores 16 --contigs "chr1:145631664|145706593" --bedgraphs "CT" --paired_end --num_intervals_per_contig 2 --keep_intermediate_files

If we look at the final_filtered_site_info.tsv output file, we see that the edit site at position 74928 has a count of 16 while the coverage is only 11.

Screenshot 2024-09-19 at 3 55 33 PM

This glitch is rare: it occurs only one other time in this example, at position 15509, and does not seem to have a significant effect on downstream analysis. I am unsure what causes it, but a count greater than the coverage should be impossible.
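
To list every affected site instead of spotting them by eye, a check along these lines might help. This is only a rough sketch: the column names `position`, `count` and `coverage` are my assumption about the header of final_filtered_site_info.tsv and may need adjusting, `find_overcounted_sites` is a hypothetical helper and not part of MARINE, and the toy table is made up apart from the first row, which reuses the numbers reported above.

```python
import csv
import io


def find_overcounted_sites(tsv_file, count_col="count", coverage_col="coverage"):
    """Return the rows of a tab-separated table whose count exceeds coverage.

    The column names are assumptions about the MARINE output header.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [
        row for row in reader
        if float(row[count_col]) > float(row[coverage_col])
    ]


# Toy table: the first row reuses the values reported in this issue,
# the second row is invented to show a site that passes the check.
sample = (
    "position\tcount\tcoverage\n"
    "74928\t16\t11\n"
    "100\t3\t9\n"
)

for row in find_overcounted_sites(io.StringIO(sample)):
    print(row["position"], row["count"], row["coverage"])
```

On a real run, the same function could be given an open file handle for final_filtered_site_info.tsv in the output folder (opened with `newline=""`) in place of the `io.StringIO` object.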

I will note that this bug was encountered as part of an ongoing effort to apply MARINE to circular RNA datasets that have been passed through the CIRIquant pseudo-reference alignment pipeline, so it may be specific to circular RNA examples.
