Tophat Rna-Seq Mapping Quality
3
1
Entering edit mode
8.4 years ago
siyu ▴ 140

if a read can be mapped to more than one location in reference genome,so-called mulit-read , what's the read's mapping quality reported by TopHat ??

thanks so much?

tophat rna-seq mapping quality • 9.2k views
1
Entering edit mode
8.0 years ago
predeus ★ 1.7k

This actually does not seem to hold upon checking (or at least not anymore). In Tophat 1.4.1, there is no mapping quality of 2. I've looked at a large number of BAM files and it seems that the qualities are as follows:

• 255 = unique mapping
• 3 = maps to 2 locations in the target
• 1 = maps to 3-4 locations
• 0 = maps to 5 or more locations (up to the number defined in "--prefilter-multihits").

cheers

0
Entering edit mode

Interesting, thanks for doing the legwork!

0
Entering edit mode
8.4 years ago

For Tophat:

255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations.

I think this is correct, it fits with what I have observed.

0
Entering edit mode

Thank you so much !

0
Entering edit mode
7.7 years ago
zju.whw ▴ 50

My tophat output bam file is using 50 to indicate the unique mapping, is there any official materials about the mapping quality of unique mapped reads?

In tophat website, it says that "most of the optional SAM fields (AS, MD, NM, and etc.) generated by Bowtie 2 are now reported by TopHat as well (reconstructed as necessary)".

And in bowtie2 website, it says that "Mapping quality: higher = more unique".

However, I cannot find the exact mapping quality number for unique mapped reads

0
Entering edit mode

As mentioned above the unique mapping should be scored 255. I don't know why your BAM file has score of 50 for unique alignments. Did you generate this BAM file or you got it from someone. Sometimes people cap the maximum mapping quality to 50 because some downstream analysis softwares throw error if they find a mapping score of 255.

0
Entering edit mode

I don't know why. The BAM file is got by myself, using the default tophat paramters. However, I have not found any side-effect so far. Thank you for your reply.

0
Entering edit mode

I've never fully trusted the mapping quality scores, so when I am interested in uniquely mapped reads I filter based on the SAM tags. I know that for TopHat v1.4.1 onwards (I haven't looked at earlier versions), the number of mappings for each read is reported in the "NH" tag, so you can write a simple script to only keep reads with this tag set to 1.

0
Entering edit mode

Yes, you're right. There are at least two ways to get the uniquely mapped, NH:i:1 tag and mapping quality scores. I have made a mistake that I thought the NH:i:1 is in a constant column. But it is actually not. So it takes more time to identify the column number of the NH:i tag. And the bam file should be conversed to sam file, and converse back after filtering. On the other hand, you can use samtools view -bq to get the unique mapped reads. It is faster and easier. What is your scripts? What is your opinion

0
Entering edit mode

I agree that you can get uniquely reads quicker using samtools, assuming that you are confident in the mapping qualities. I was looking at metrics beyond counting/extracting the uniquely aligned reads, and I was using output from several different aligners, so I wrote a perl script for this task. This script is available as part of a software package at http://sourceforge.net/projects/rnaseqvariantbl/ (see extract_unique.pl).

0
Entering edit mode

thank you very much