Question: Why Tophat Does Not Map Some Reads
5.3 years ago by
alteralex40 wrote:

Hi, I am trying to use Tophat for my RNA-seq analysis. I tested my paired-end (PE) reads with three protocols.

1. Without a reference gene annotation (GTF file), some PE reads are mapped correctly, with flags 83 and 163. For example:

M01339:30:000000000-A42G7:1:1101:15690:1356 163 chr8 126142440 0 32M = 126142503 96 TACAGCACCCGGTATTCCCAGGCGGTCTCCCA $$$$$%%%&&&&"$%&(((((('%'( AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:32 YT:Z:UU NH:i:9 HI:i:8

M01339:30:000000000-A42G7:1:1101:15690:1356 83 chr8 126142503 0 33M = 126142440 -96 GCTTCCGAGATCAGACGAGATCGGGCGCGTTCA '''&''((%'$'''#'''#&&#&"&&$'&'$"" AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:33 YT:Z:UU NH:i:9 HI:i:8

2. I provided the GTF file downloaded from iGenomes. The other parameters were the same, but then some of the PE reads lost their pairing in the mapping:

M01339:30:000000000-A42G7:1:1101:15690:1356 89 chr8 126142503 0 33M * 0 0 GCTTCCGAGATCAGACGAGATCGGGCGCGTTCA '''&''((%'$'''#'''#&&#&"&&$'&'$"" AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:33 YT:Z:UU NH:i:20 HI:i:19

Please note that the mate that previously mapped to chr8:126142440 has lost its mapping here!

3. I added another option, "--library-type fr-unstranded". This time, both mappings are gone! The following is found in the unmapped file:

M01339:30:000000000-A42G7:1:1101:15690:1356 69 * 0 255 * * 0 0 TGAACGCGCCCGATCTCGTCTGATCTCGGAAGC ""$'&'$&&"&#&&#'''#'''$'%((''&'''

M01339:30:000000000-A42G7:1:1101:15690:1356 133 * 0 255 * * 0 0 TACAGCACCCGGTATTCCCAGGCGGTCTCCCA $$$$$%%%&&&&"$%&(((((('%'(
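To make the flags in the three cases easier to compare, here is a minimal Python sketch that decodes them. The bit meanings follow the SAM specification; the names `SAM_FLAG_BITS` and `decode_flag` are just illustrative and are not part of Tophat or samtools.

```python
# Bit values and meanings of the SAM FLAG field (per the SAM specification).
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "read1"),
    (0x80, "read2"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value, lowest bit first."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


if __name__ == "__main__":
    # Flags seen in the three runs described above.
    for flag in (83, 163, 89, 69, 133):
        print(flag, ", ".join(decode_flag(flag)))
```

By this decoding, 83 and 163 are read 1 and read 2 of a properly paired alignment; 89 is read 1 mapped on the reverse strand with its mate unmapped; 69 and 133 are read 1 and read 2, both unmapped.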

Could anyone give me some insights? Thank you in advance!
