Question

TopHat RNA-Seq Mapping Quality

1

11.6 years ago

siyu ▴ 150

If a read can be mapped to more than one location in the reference genome (a so-called multi-read), what mapping quality does TopHat report for it?

How about BWA?

Thanks so much!

tophat rna-seq mapping quality • 12k views

ADD COMMENT • link updated 5.5 years ago by Biostar 20 • written 11.6 years ago by siyu ▴ 150

score 1 · Answer 1 · 2013-09-05

1

11.2 years ago

predeus ★ 2.1k

The SEQanswers table quoted in the other answer does not actually seem to hold upon checking (or at least not anymore). In TopHat 1.4.1, there is no mapping quality of 2. I've looked at a large number of BAM files, and the qualities appear to be as follows:

255 = unique mapping
3 = maps to 2 locations in the target
1 = maps to 3-4 locations
0 = maps to 5 or more locations (up to the number defined in "--prefilter-multihits").
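
One way to check a table like this against your own data is to cross-tabulate the MAPQ column against the NH tag. The following is only a minimal sketch: it assumes the alignments are available as SAM text lines that carry an `NH:i:` tag, and the helper names and demo records are made up for illustration rather than taken from TopHat output.

```python
from collections import Counter, defaultdict


def nh_value(optional_fields):
    """Return the NH (number of reported hits) tag as an int, or None."""
    for field in optional_fields:
        if field.startswith("NH:i:"):
            return int(field[5:])
    return None


def mapq_by_hits(sam_lines):
    """Tabulate which NH values occur together with each MAPQ value."""
    table = defaultdict(Counter)
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        nh = nh_value(fields[11:])  # optional tags start at column 12
        if nh is not None:
            table[int(fields[4])][nh] += 1  # column 5 is MAPQ
    return table


def fake_record(name, mapq, nh):
    """Build a made-up SAM record (illustration only, not real output)."""
    return "\t".join([name, "0", "chr1", "100", str(mapq), "4M",
                      "*", "0", "0", "ACGT", "IIII", f"NH:i:{nh}"])


demo = [
    "@HD\tVN:1.0",
    fake_record("r1", 255, 1),
    fake_record("r2", 255, 1),
    fake_record("r3", 3, 2),
    fake_record("r4", 1, 4),
    fake_record("r5", 0, 7),
]

for mapq, hits in sorted(mapq_by_hits(demo).items(), reverse=True):
    print(mapq, dict(hits))
```

If each MAPQ value only ever co-occurs with the NH values listed for it, the table holds for that TopHat version.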

cheers

ADD COMMENT • link 11.2 years ago by predeus ★ 2.1k

0

Interesting, thanks for doing the legwork!

ADD REPLY • link 11.2 years ago by Mikael Huss 4.8k

Ram · Answer 2 · 2013-04-23

0

11.6 years ago

Mikael Huss 4.8k

For Tophat:

From http://seqanswers.com/forums/showthread.php?t=10624:

255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations.

I think this is correct; it fits with what I have observed.
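
For reference, the table above can be written as a small lookup. This is just a sketch of the mapping as quoted from the SEQanswers thread, not code from TopHat itself; the function name `mapq_from_hits` is made up for illustration, and the values may differ between TopHat versions.

```python
def mapq_from_hits(n_hits):
    """MAPQ for a read with n_hits locations, per the quoted table."""
    if n_hits < 1:
        raise ValueError("a mapped read has at least one location")
    if n_hits == 1:
        return 255  # unique mapping
    if n_hits == 2:
        return 3
    if n_hits == 3:
        return 2
    if n_hits <= 9:
        return 1    # 4-9 locations
    return 0        # 10 or more locations


for n in (1, 2, 3, 4, 9, 10):
    print(n, mapq_from_hits(n))
```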

ADD COMMENT • link updated 4.9 years ago by Ram 44k • written 11.6 years ago by Mikael Huss 4.8k

0

Thank you so much !

ADD REPLY • link 11.6 years ago by siyu ▴ 150

Ram · Answer 3 · 2014-01-09

0

10.8 years ago

zju.whw ▴ 70

My TopHat output BAM file uses 50 to indicate a unique mapping. Is there any official documentation on the mapping quality of uniquely mapped reads?

The TopHat website says that "most of the optional SAM fields (AS, MD, NM, and etc.) generated by Bowtie 2 are now reported by TopHat as well (reconstructed as necessary)".

And the Bowtie 2 website says "Mapping quality: higher = more unique".

However, I cannot find the exact mapping quality value for uniquely mapped reads.

ADD COMMENT • link updated 5.1 years ago by Ram 44k • written 10.8 years ago by zju.whw ▴ 70

0

As mentioned above, a unique mapping should be scored 255. I don't know why your BAM file has a score of 50 for unique alignments. Did you generate this BAM file yourself, or did you get it from someone else? Sometimes people cap the maximum mapping quality at 50 because some downstream analysis software throws an error when it finds a mapping quality of 255.

ADD REPLY • link 10.8 years ago by Ashutosh Pandey 12k

0

I don't know why. I generated the BAM file myself, using the default TopHat parameters. However, I have not noticed any side effects so far. Thank you for your reply.

ADD REPLY • link 10.8 years ago by zju.whw ▴ 70

0

I've never fully trusted the mapping quality scores, so when I am interested in uniquely mapped reads I filter based on the SAM tags. I know that for TopHat v1.4.1 onwards (I haven't looked at earlier versions), the number of mappings for each read is reported in the "NH" tag, so you can write a simple script to only keep reads with this tag set to 1.
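
A simple script along those lines might look like the sketch below. It assumes the input is SAM text (for example, the lines produced by `samtools view -h`) and that the aligner writes an `NH:i:` tag; the function name `keep_unique` is made up for illustration. Records without an NH tag are dropped here, which is a design choice rather than a rule.

```python
def keep_unique(sam_lines):
    """Yield header lines and alignments whose NH tag equals 1."""
    for line in sam_lines:
        if line.startswith("@"):  # pass header lines through unchanged
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Optional tags start at column 12 and can appear in any order,
        # so scan all of them instead of relying on a fixed column.
        if "NH:i:1" in fields[11:]:
            yield line
```

Because the function works on any iterable of lines, it could be fed `sys.stdin` and placed between `samtools view -h` and `samtools view -b` in a pipe, so no intermediate SAM file needs to be written.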

ADD REPLY • link 10.8 years ago by Chris Cabanski ▴ 330

0

Yes, you're right. There are at least two ways to get the uniquely mapped reads: the NH:i:1 tag and the mapping quality score. I had mistakenly assumed that NH:i:1 always appears in a fixed column, but it does not, so it takes extra work to locate the NH:i tag in each record. The BAM file also has to be converted to SAM and then converted back after filtering. On the other hand, you can use samtools view -bq to get the uniquely mapped reads, which is faster and easier. What does your script look like? What is your opinion?
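
To make the MAPQ route concrete, here is a rough Python equivalent of what the `-q` threshold does: it keeps alignments whose MAPQ is at least the given value. This is a sketch on SAM text lines, not a replacement for samtools, and the function name is made up for illustration. The threshold of 4 in the example is an assumption based on the values reported in this thread, where multi-reads score 3 or lower and unique reads score 255 (or 50).

```python
def filter_by_mapq(sam_lines, min_mapq):
    """Yield header lines and alignments with MAPQ >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):  # pass header lines through unchanged
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 5 of a SAM record holds the mapping quality.
        if len(fields) >= 11 and int(fields[4]) >= min_mapq:
            yield line


# Made-up records for illustration only.
template = "r{0}\t0\tchr1\t100\t{1}\t4M\t*\t0\t0\tACGT\tIIII"
records = [template.format(i, q) for i, q in enumerate([255, 50, 3, 1, 0])]
kept = list(filter_by_mapq(records, 4))
print(len(kept), "of", len(records), "records kept")
```

This only isolates unique reads if the aligner's MAPQ values really follow such a scheme, which is worth confirming for your TopHat version first.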

ADD REPLY • link updated 5.1 years ago by Ram 44k • written 10.8 years ago by zju.whw ▴ 70

0

I agree that you can get uniquely mapped reads more quickly using samtools, assuming that you are confident in the mapping qualities. I was looking at metrics beyond counting/extracting the uniquely aligned reads, and I was using output from several different aligners, so I wrote a Perl script for this task. This script is available as part of a software package at http://sourceforge.net/projects/rnaseqvariantbl/ (see extract_unique.pl).