How To Check That Tophat Works Well
8.8 years ago
M K ▴ 590

I used TopHat to map my single-end RNA-seq reads to the human genome (hg19) and got several files in the TopHat output folder. I need to know how many reads mapped to the reference genome, how many did not map, and the overall alignment percentage.

BTW, I used the following samtools command:

        samtools flagstat accepted_hits.bam


and it gave me the following results:

  175638490 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
175638490 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)


Is there any explanation for the zeros above?

8.8 years ago
Martombo ★ 3.0k

Have a look at the file align_summary.txt. You cannot get the mapping statistics from accepted_hits.bam, because it contains only the mapped reads (the unmapped ones are in unmapped.bam). From what you wrote, you can only deduce that 175638490 of your reads are mapped. All the zeros you see refer to paired-end statistics; since your data is not paired, they are all zero.
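
If align_summary.txt is not available, the same numbers can be approximated from the two BAM files. The following is only a sketch: the folder name `tophat_out` and the helper `pct_mapped` are made up for illustration, and it assumes that secondary alignments of multi-mapped reads carry the 0x100 flag, which is worth checking on your own data before trusting the count.

```shell
#!/usr/bin/env bash
# Sketch: overall mapping rate from a TopHat output folder.
# Assumption: the folder holds accepted_hits.bam and unmapped.bam.
outdir="${1:-tophat_out}"   # hypothetical folder name, pass your own as $1

# Hypothetical helper: percentage of mapped reads from two counts.
pct_mapped() {
    awk -v m="$1" -v u="$2" 'BEGIN {
        if (m + u == 0) print "NA"
        else printf "%.2f\n", 100 * m / (m + u)
    }'
}

if [ -f "$outdir/accepted_hits.bam" ] && [ -f "$outdir/unmapped.bam" ]; then
    # -c counts records; -F 256 skips records flagged as secondary alignments,
    # so a multi-mapped read is (under the assumption above) counted once.
    mapped=$(samtools view -c -F 256 "$outdir/accepted_hits.bam")
    unmapped=$(samtools view -c "$outdir/unmapped.bam")
    echo "mapped: $mapped  unmapped: $unmapped  rate: $(pct_mapped "$mapped" "$unmapped")%"
fi
```

Running it as `bash mapping_rate.sh /path/to/tophat/output` (script name is arbitrary) prints the two counts and the percentage. If the input reads were filtered before mapping, the total here will be smaller than the number of reads in the original FASTQ file.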


Thanks Martombo. Could you please tell me where I can find align_summary.txt?


It should be in the TopHat output folder.


Hi Martombo, could you please tell me where exactly this file is located? I didn't find it in my output folder.