Question: understanding multi-mapped reads
gravatar for Assa Yeroslaviz
2.5 years ago by
Assa Yeroslaviz1.2k
Assa Yeroslaviz1.2k wrote:

Hi everybody,

I am having a problem understanding the output of tophat concerning the multi-mapped reads. I would like to create a barplot of the unique vs. multi-mapped vs. unmapped reads from tophat-mapped bam files. (BTW is there already a tool for doing that?)

Now I wanted to extract the multi-mapped reads from one bam file. lat us say MW6. this is the align_summary.txt file I have:

          Input     :  26140314
           Mapped   :  25159791 (96.2% of input)
            of these:   1027691 ( 4.1%) have multiple alignments (1832 have >20)
96.2% overall read mapping rate.

But when I try to count the reads manually I get this :

$samtools view bamFiles/MW6.sorted.bam | grep -w -v  "NH:i:1" | wc -l

and when I use the flagstat option from samtools:

$samtools flagstat bamFiles/MW6.SE.sorted.bam
28250742 + 0 in total (QC-passed reads + QC-failed reads)
3090951 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
28250742 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
... (the rest is 0)

The total number of input reads is 26140314 (as it says in the align_summary file). The total mapped reads from the bam file is 28250742, as says in the samtools flagstat. I guess this conflict is due to reads which mapped multiple times, but how can I get the exact number of multiple reads?

is it just 28250742 - 26140314 = 2110428 ?

Would it be similar in paired-end and single-end mapping runs?



ADD COMMENTlink modified 2.5 years ago by Carlo Yague4.6k • written 2.5 years ago by Assa Yeroslaviz1.2k
gravatar for Carlo Yague
2.5 years ago by
Carlo Yague4.6k
Carlo Yague4.6k wrote:

The total mapped reads from the bam file is 28250742

This is the total number of alignments, which is different than the total number of reads. Reads can have 0, 1 or more alignments.

  • There are 25159791-1027691=24132100 reads that map only 1 time.
  • There are 1027691 reads that map multiple times.
  • The 24132100 uni-reads generate 24132100 alignments.
  • The 1027691 multi-reads generate 28250742-24132100=4118642 alignments including 3090951 secondary alignments.
ADD COMMENTlink written 2.5 years ago by Carlo Yague4.6k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 626 users visited in the last hour