Tophat2 Mapping Qualities
8.5 years ago

I am running TopHat2 on the human genome with default parameters and a GTF file from Ensembl.

The accepted_hits.bam file contains only three mapping quality (MAPQ) values in the 5th column: 0, 3, and 50. Is this correct, or have I done something wrong?

As far as I know, a mapping quality of 0 means the read mapped to multiple locations, but for all other reads I see only 3 and 50. The data are from a paired-end library sequenced on an Illumina HiSeq 1000.

I would be grateful if anybody could help me figure out why I am getting only three values.
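
For anyone who wants to reproduce this kind of check, here is a minimal sketch of tallying the MAPQ column (column 5) of SAM-formatted text. The function name `count_mapq` and the toy records are made up for illustration; with real data the same function could be fed the lines produced by `samtools view accepted_hits.bam`.

```python
from collections import Counter


def count_mapq(sam_lines):
    """Tally MAPQ values (SAM column 5), skipping blank and header ('@') lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[int(fields[4])] += 1
    return counts


# Toy records invented for illustration; they are not real TopHat2 output.
toy_sam = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t0\tchr1\t100\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t0\tchr1\t200\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read3\t0\tchr2\t300\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read4\t0\tchr3\t400\t0\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

# Print one "MAPQ<TAB>count" row per distinct value, lowest MAPQ first.
for mapq, n in sorted(count_mapq(toy_sam).items()):
    print(f"{mapq}\t{n}")
```

Seeing only a handful of distinct values in this tally is what prompted the question.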

tophat2 bowtie2 • 8.8k views

You may also want to look at the TopHat2 options that affect this, such as -g 1 (--max-multihits).

8.5 years ago

Your results are normal. The MAPQ scores reported by TopHat2 are not computed as -10*log10(probability that the mapping is wrong). A score of 50 means uniquely mapped, and 0-3 indicate various degrees of multiple mapping.
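
For contrast, here is a small sketch of the conventional Phred-scaled definition mentioned above, which is what the SAM specification describes for the MAPQ field. The function names are my own. It only shows what the numbers would mean if they were Phred-scaled; it does not describe how TopHat2 assigns them.

```python
import math


def mapq_from_error_prob(p_wrong):
    """Conventional MAPQ: -10 * log10(p_wrong), rounded to the nearest integer.

    p_wrong must be greater than 0; log10(0) is undefined.
    """
    return round(-10 * math.log10(p_wrong))


def error_prob_from_mapq(mapq):
    """Invert the Phred scale: the error probability implied by a MAPQ value."""
    return 10 ** (-mapq / 10)


# What the three values from the question would imply under the Phred reading.
for mapq in (0, 3, 50):
    print(mapq, error_prob_from_mapq(mapq))
```

Read that way, a MAPQ of 50 would imply an error probability of about 1 in 100,000. TopHat2 uses 50 simply as a label for "unique", so the value should not be interpreted as a calibrated probability.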


255 = unique mapping

3 = maps to 2 locations in the target

2 = maps to 3 locations

1 = maps to 4-9 locations

0 = maps to 10 or more locations.
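
The scheme above can be written as a small lookup, shown here as a sketch. The function name and the wording of the returned strings are my own, and I treat both 255 and 50 as "unique" because different TopHat2 versions in this thread report one or the other.

```python
# MAPQ values for multi-mapped reads, as listed in this thread.
MULTIMAP_MEANING = {
    3: "maps to 2 locations",
    2: "maps to 3 locations",
    1: "maps to 4-9 locations",
    0: "maps to 10 or more locations",
}

# Values reported for uniquely mapped reads (which one depends on the version).
UNIQUE_VALUES = {50, 255}


def describe_tophat_mapq(mapq):
    """Translate a TopHat/TopHat2 MAPQ value into the meaning given in this thread."""
    if mapq in UNIQUE_VALUES:
        return "unique mapping"
    # Any value outside the listed scheme is reported as such, not guessed at.
    return MULTIMAP_MEANING.get(mapq, "not covered by this scheme")


for value in (0, 1, 2, 3, 50, 255):
    print(value, describe_tophat_mapq(value))
```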


So 50 means unique mapping in TopHat2?


Yeah, they changed from 255 to 50 at some point. I have no clue which release had the change.


dpryan79 may be right about that. I haven't used the latest version of TopHat2. Maybe they now assign uniquely aligned reads a MAPQ of 50. One reason could be that some downstream tools, such as GATK, complain when they see a MAPQ of 255.


Is this documented anywhere?


Not that I know of. To make life slightly more complicated, the scores used to be 0-3 and 255, so don't be surprised if you see that if you have older datasets.
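
If you need to combine older and newer datasets, one option is to rewrite the legacy value before downstream analysis. This is a rough sketch that assumes tab-separated SAM text; the function name and parameters are hypothetical. Keep in mind that the SAM specification uses 255 to mean "mapping quality not available", so only do this for files you know came from an older TopHat run.

```python
def harmonize_mapq(sam_line, legacy_unique=255, current_unique=50):
    """Return the SAM line with a legacy 'unique' MAPQ replaced by the current one.

    Header lines (starting with '@') are returned unchanged. The trailing
    newline, if any, is stripped from every returned line.
    """
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    if int(fields[4]) == legacy_unique:
        fields[4] = str(current_unique)
    return "\t".join(fields)


# Toy record invented for illustration.
old_record = "read1\t0\tchr1\t100\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
print(harmonize_mapq(old_record))
```

The multi-mapping values 0-3 are left untouched, since both the older and newer scoring described in this thread use them.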


With a recent TopHat (2.0.13) I see mostly quality scores of 3 and 50, but also 41, 42, and 44, as well as 24 and 28.


Gotta love undocumented changes.