I used TopHat to map my reads against the corresponding reference genome.
When I look inside prep_reads.info, I see:
- left_reads_in =24995053
- right_reads_in =24995053
Then when I open align_summary.txt, I see:
Mapped: 22715900 (90.9% of input)
of these: 2106892 ( 9.3%) have multiple alignments (89 have >20)
Mapped: 22310498 (89.3% of input)
of these: 2088630 ( 9.4%) have multiple alignments (148 have >20)
90.1% overall read alignment rate.
Aligned pairs: 21074559
of these: 1469415 ( 7.0%) have multiple alignments
and: 107380 ( 0.5%) are discordant alignments
83.9% concordant pair alignment rate.
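As a quick sanity check on the numbers above, here is a minimal sketch that recomputes the reported percentages. It assumes the "Input" counts (not shown in the excerpt above) are equal to the `_reads_in` counts from prep_reads.info; that is only an assumption for this check, and the helper name `percent` is my own.

```python
# Counts copied from prep_reads.info and align_summary.txt above.
left_reads_in = 24_995_053
right_reads_in = 24_995_053
left_mapped = 22_715_900
right_mapped = 22_310_498


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: the denominator is the *_reads_in count, not the "Input" line.
left_rate = percent(left_mapped, left_reads_in)
right_rate = percent(right_mapped, right_reads_in)
overall_rate = percent(left_mapped + right_mapped,
                       left_reads_in + right_reads_in)

print(left_rate, right_rate, overall_rate)
```

Under that assumption the rates come out as 90.9, 89.3 and 90.1, the same as in align_summary.txt at one decimal place, so the rounded percentages alone do not show how many reads were dropped between `_reads_in` and `_reads_out`.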
In align_summary.txt, I understand that the difference between the "Input" and "Mapped" numbers arises because some reads do not map to the reference genome. That part is fine.
But in prep_reads.info, I do not understand why the "_reads_out" numbers differ from the "_reads_in" numbers. And if this difference is due to unmapped reads, why is it not equal to the difference between the "Input" and "Mapped" numbers in align_summary.txt?