Question: How does slimm deal with discordant mapping of paired end reads? #38

@your-highness

Dear @temehi & @agakrawczyk,

@agakrawczyk and I ran into problems when using bowtie2 with slimm:

  1. Yara, bowtie2 and slimm indices were built for Human chr1 & chr11 and C-RVDB (https://rvdb.dbi.udel.edu/).
  2. An in silico data set comprising 91% Human chr1 & chr11 reads and 9% Human virus reads of various species was generated and mapped with bowtie2 or yara.
  3. slimm was used for abundance estimation on the bowtie2 and yara mappings.

While "yara+slimm" gave consistent results for all assayed viruses, "bowtie2+slimm" failed for two viruses. Closer inspection of the mapping files showed that bowtie2 reported many discordant mappings across various reference sequences:

MK630134.1_50_0 97      KY315545.1      7713    1       301M    KY315552.1      8791    0       CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG   CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/*   AS:i:-2 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:281C19     YT:Z:UP
MK630134.1_50_0 353     KY290183.1      7394    1       301M    KY315552.1      8791    0       CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG   CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/*   AS:i:-2 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:281C19     YT:Z:UP
MK630134.1_50_0 353     KY316048.1      11168   1       301M    KY315552.1      8791    0       CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG   CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/*   AS:i:-2 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:281C19     YT:Z:UP
MK630134.1_50_0 353     KY274508.1      7501    1       301M    KY315552.1      8791    0       CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG   CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/*   AS:i:-2 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:281C19     YT:Z:UP
MK630134.1_50_0 353     MH698400.1      15286   1       301M    KY315552.1      8791    0       CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG   CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/*   AS:i:-2 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:281C19     YT:Z:UP
MK630134.1_50_0 353     KY315552.1      7711    1       301M    =       8791    0       CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG   CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/*   AS:i:-2 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:281C19     YT:Z:UP

Yara did not report these discordant mappings.
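
To make the FLAG values in the records above easier to read, here is a minimal Python sketch that decodes the flag bits defined in the SAM specification and checks whether the mate is placed on a different reference (RNEXT differs from RNAME). It is not part of slimm, bowtie2 or yara; the function names `decode_flag`, `mate_on_other_reference` and `summarize` are hypothetical, and the two input lines are just the first seven fields of records from the listing. As far as the bowtie2 manual describes it, `YT:Z:UP` marks a read from a pair that aligned neither concordantly nor discordantly, so these records may be unpaired ("mixed mode") alignments rather than discordant ones in bowtie2's own terminology.

```python
# SAM flag bits as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def mate_on_other_reference(rname, rnext):
    """True if the mate is placed on a different reference than the read.

    In SAM, RNEXT '=' means 'same as RNAME' and '*' means 'no information'.
    """
    return rnext not in ("=", "*", rname)


def summarize(sam_line):
    """Summarize the first seven mandatory fields of one SAM alignment line."""
    fields = sam_line.split()
    qname, flag, rname, rnext = fields[0], int(fields[1]), fields[2], fields[6]
    names = decode_flag(flag)
    return {
        "qname": qname,
        "rname": rname,
        "flags": names,
        "proper_pair": "proper_pair" in names,
        "secondary": "secondary" in names,
        "mate_elsewhere": mate_on_other_reference(rname, rnext),
    }


# First seven fields of two records from the listing (SEQ, QUAL and tags omitted).
records = [
    "MK630134.1_50_0 97 KY315545.1 7713 1 301M KY315552.1",
    "MK630134.1_50_0 353 KY315552.1 7711 1 301M =",
]
for rec in records:
    print(summarize(rec))
```

Neither flag has the proper-pair bit (0x2) set: 97 decodes to paired, mate reverse, first in pair, and 353 adds the secondary bit (0x100). A tool that only counts properly paired or primary alignments would therefore skip such records; whether slimm does so is exactly the open question here.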

My question: How does slimm deal with discordant mappings? My suspicion is that they are discarded.

Thanks in advance!
