I have a data set of paired-end samples which I'm mapping with bowtie2.
I am mainly focused on the unmapped reads, as these are the reads we're interested in.
This is a sample output of the mapping results from one of my runs:
11216394 reads; of these:
  11216394 (100.00%) were paired; of these:
    8079466 (72.03%) aligned concordantly 0 times
    2987422 (26.63%) aligned concordantly exactly 1 time
    149506 (1.33%) aligned concordantly >1 times
    ----
    8079466 pairs aligned concordantly 0 times; of these:
      417642 (5.17%) aligned discordantly 1 time
    ----
    7661824 pairs aligned 0 times concordantly or discordantly; of these:
      15323648 mates make up the pairs; of these:
        15226093 (99.36%) aligned 0 times
        63924 (0.42%) aligned exactly 1 time
        33631 (0.22%) aligned >1 times
32.13% overall alignment rate
Out of this output, how many unmapped reads do I have?
My understanding is that it is the number in the fourth row from the bottom, i.e. 15226093?
If I calculate all three categories in this summary (in reads, not pairs), I get:
aligned once: (2987422 * 2) + (417642 * 2) + 63924 = 6874052
aligned >1: (149506 * 2) + 33631 = 332643
unaligned: 15226093
sum: 6874052 + 332643 + 15226093 = 22432788
This last number equals the total number of input reads for this sample (11216394 pairs x 2).
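For anyone who wants to check the tally, here is a minimal Python sketch of the same arithmetic. The variable and function names are my own; the counts are copied by hand from the summary, nothing is parsed from bowtie2 itself.

```python
# Counts copied from the bowtie2 summary (hypothetical variable names).
TOTAL_PAIRS = 11216394
CONC_ONCE_PAIRS = 2987422     # pairs aligned concordantly exactly 1 time
CONC_MULTI_PAIRS = 149506     # pairs aligned concordantly >1 times
DISC_ONCE_PAIRS = 417642      # pairs aligned discordantly 1 time
MATES_UNALIGNED = 15226093    # mates aligned 0 times
MATES_ONCE = 63924            # mates aligned exactly 1 time
MATES_MULTI = 33631           # mates aligned >1 times


def tally_reads():
    """Convert pair counts to read counts (2 reads per pair) and group them."""
    return {
        "aligned_once": CONC_ONCE_PAIRS * 2 + DISC_ONCE_PAIRS * 2 + MATES_ONCE,
        "aligned_multi": CONC_MULTI_PAIRS * 2 + MATES_MULTI,
        "unaligned": MATES_UNALIGNED,
    }


counts = tally_reads()
total_reads = sum(counts.values())

for name, value in counts.items():
    print(f"{name}: {value}")
print(f"sum: {total_reads}")
print(f"matches input reads: {total_reads == TOTAL_PAIRS * 2}")
```

The three categories add up to exactly twice the number of input pairs, so no read is counted twice or left out in this breakdown.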
But when I look at the number of reads in the unmapped.fastq file after the mapping, it corresponds to the third row of the summary (x2): 8079466 x 2 = 16158932.
But then adding the mapped and unmapped reads together gives me more reads than there were in total.
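To make the mismatch concrete, here is a small Python sketch of the numbers involved. The variable names are mine and the counts are copied by hand from the summary. It only restates the arithmetic and does not show how bowtie2 actually filled the fastq file.

```python
# Read counts taken from the summary and from my own tally (hypothetical names).
total_reads = 11216394 * 2        # input reads (2 per pair)
aligned_once = 6874052            # my "aligned once" tally, in reads
aligned_multi = 332643            # my "aligned >1" tally, in reads
fastq_reads = 8079466 * 2         # reads found in unmapped.fastq

# Adding the fastq reads to the aligned reads overshoots the input.
combined = aligned_once + aligned_multi + fastq_reads
excess = combined - total_reads

# Reads from pairs that did not align concordantly but still aligned somehow:
# discordant pairs (2 reads each) plus mates that aligned on their own.
aligned_not_concordant = 417642 * 2 + 63924 + 33631

print(f"combined: {combined}")
print(f"excess over input: {excess}")
print(f"excess equals non-concordant aligned reads: {excess == aligned_not_concordant}")
```

The excess is the same number as the reads from pairs that aligned discordantly or as single mates, so it looks like those reads are counted both in my aligned tally and in the fastq file, though I am not sure that is the right interpretation.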
Can anyone help me understand what is happening here?
How do I calculate the number of unmapped reads in a paired-end mapped sample?