This actually does not seem to hold upon checking (or at least not anymore). In Tophat 1.4.1, there is no mapping quality of 2. I've looked at a large number of BAM files and it seems that the qualities are as follows:
cheers
For Tophat:
From http://seqanswers.com/forums/showthread.php?t=10624:
255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations.
I think this is correct, it fits with what I have observed.
My tophat output bam file is using 50 to indicate the unique mapping, is there any official materials about the mapping quality of unique mapped reads?
In tophat website, it says that "most of the optional SAM fields (AS, MD, NM, and etc.) generated by Bowtie 2 are now reported by TopHat as well (reconstructed as necessary)".
And in bowtie2 website, it says that "Mapping quality: higher = more unique".
However, I cannot find the exact mapping quality number for unique mapped reads
As mentioned above the unique mapping should be scored 255. I don't know why your BAM file has score of 50 for unique alignments. Did you generate this BAM file or you got it from someone. Sometimes people cap the maximum mapping quality to 50 because some downstream analysis softwares throw error if they find a mapping score of 255.
I've never fully trusted the mapping quality scores, so when I am interested in uniquely mapped reads I filter based on the SAM tags. I know that for TopHat v1.4.1 onwards (I haven't looked at earlier versions), the number of mappings for each read is reported in the "NH" tag, so you can write a simple script to only keep reads with this tag set to 1.
Yes, you're right. There are at least two ways to get the uniquely mapped, NH:i:1 tag and mapping quality scores.
I have made a mistake that I thought the NH:i:1 is in a constant column. But it is actually not. So it takes more time to identify the column number of the NH:i tag. And the bam file should be conversed to sam file, and converse back after filtering.
On the other hand, you can use samtools view -bq
to get the unique mapped reads. It is faster and easier.
What is your scripts? What is your opinion
I agree that you can get uniquely reads quicker using samtools, assuming that you are confident in the mapping qualities. I was looking at metrics beyond counting/extracting the uniquely aligned reads, and I was using output from several different aligners, so I wrote a perl script for this task. This script is available as part of a software package at http://sourceforge.net/projects/rnaseqvariantbl/ (see extract_unique.pl).
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Interesting, thanks for doing the legwork!