Question: Why does unmapped.bam from TopHat have all reads with MAPQ >= 30?
4.5 years ago by
tiago2112871.1k
USA
tiago2112871.1k wrote:

I used the samstat tool (http://samstat.sourceforge.net/) to look at the MAPQ scores of my alignment.

First, I ran it on accepted_hits.bam and found some reads classified as unmapped, which I thought was strange. Shouldn't the accepted hits contain only mapped reads?

Second, when I run samstat on unmapped.bam, all reads are reported with MAPQ scores of 30 or more.

Can someone explain this to me?

mapq samstat tophat • 1.9k views
ADD COMMENTlink modified 4.5 years ago by geek_y10k • written 4.5 years ago by tiago2112871.1k

Can you post the output of:

samtools view unmapped.bam | cut -f5 | sort | uniq -c 
ADD REPLYlink written 4.5 years ago by geek_y10k

[tiagocastro@tucunare test]$ samtools view unmapped.bam | cut -f5 | sort | uniq -c
756288 255

ADD REPLYlink written 4.5 years ago by tiago2112871.1k

I find it strange that my accepted_hits.bam has unmapped reads. Also, unmapped.bam has MAPQ scores of 30 or more.

ADD REPLYlink modified 4.5 years ago • written 4.5 years ago by tiago2112871.1k
4.5 years ago by
geek_y10k
Barcelona
geek_y10k wrote:

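The following command gave you 756288 255, which means all the reads in unmapped.bam have a mapping quality of 255. That value indicates that a mapping quality could not be assigned, so in this file they can all be considered unmapped. Since 255 is numerically greater than 30, a tool that bins reads by MAPQ value would presumably place them in its highest bin, which would explain the 30+ classification. If you run the same command on accepted_hits.bam, you should see MAPQ values of 0, 1, 3 and 50.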

samtools view unmapped.bam | cut -f5 | sort | uniq -c

You can also run the following command and check whether all the reads have * as their reference name. That also indicates the reads did not map to any chromosome.

samtools view unmapped.bam | cut -f3 | sort | uniq -c
ADD COMMENTlink modified 4.5 years ago • written 4.5 years ago by geek_y10k
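
To look at both checks at once, here is a minimal sketch that tallies MAPQ separately for mapped and unmapped reads, using the unmapped bit (0x4) of the FLAG column rather than the MAPQ value itself. The function name mapq_by_status is made up for illustration, and it assumes tab-separated SAM records on standard input, for example from samtools view.

```shell
# Hypothetical helper (not part of samtools): tally MAPQ (column 5),
# split by whether the unmapped bit 0x4 is set in FLAG (column 2).
mapq_by_status() {
  awk -F'\t' '
    /^@/ { next }                                   # skip header lines, if any
    {
      # int(FLAG / 4) % 2 is 1 exactly when bit 0x4 is set
      status = (int($2 / 4) % 2 == 1) ? "unmapped" : "mapped"
      count[status "\t" $5]++
    }
    END { for (k in count) print k "\t" count[k] }  # status, MAPQ, read count
  ' | sort
}

# Usage, assuming samtools is installed and the BAM file exists:
# samtools view accepted_hits.bam | mapq_by_status
```

Each output line is status, MAPQ and number of reads, so unmapped reads carrying MAPQ 255 show up on their own line instead of being mixed in with high-quality alignments.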

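There is no need to sort in the second command when every read has the same reference name (*), since uniq -c only needs identical values to be adjacent.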

ADD REPLYlink written 4.5 years ago by geek_y10k

Can you tell me how it is possible that the MAPQ is the same for all bases of all reads, as this samstat plot suggests?

http://s16.postimg.org/rqs9id1cl/Untitled2.png

I saw here: http://bioinfo.cipf.es/courses/mda13genomics/_media/mda13:map_qc.pdf

that "The counts and proportions should be almost invariant across read positions".

ADD REPLYlink modified 4.5 years ago • written 4.5 years ago by tiago2112871.1k

One more thing: shouldn't accepted_hits.bam have 0 unmapped reads? There must be something wrong with my alignment. As you can see here, there are 0.5% unmapped reads in accepted_hits.bam: http://s28.postimg.org/wunusnl71/Untitled.png

ADD REPLYlink written 4.5 years ago by tiago2112871.1k

It should have 0 unmapped reads. If you run the command mentioned:

samtools view accepted_hits.bam | cut -f5 | sort | uniq -c

you should not see any reads with MAPQ 255. I have never used dedicated software for these calculations; I only use samtools with various flags.

ADD REPLYlink modified 4.5 years ago • written 4.5 years ago by geek_y10k
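
As a cross-check that does not depend on MAPQ at all, one could count reads by the unmapped bit in the FLAG column. This is only a sketch: count_unmapped is a hypothetical helper name, and it assumes SAM text on standard input.

```shell
# Hypothetical helper: count records whose FLAG (column 2) has the
# 0x4 "segment unmapped" bit set.
count_unmapped() {
  awk -F'\t' '
    /^@/ { next }                   # skip header lines, if any
    int($2 / 4) % 2 == 1 { n++ }    # bit 0x4 set means the read is unmapped
    END { print n + 0 }             # prints 0 when nothing matched
  '
}

# Usage, assuming samtools is installed and the BAM file exists:
# samtools view accepted_hits.bam | count_unmapped
# samtools can apply the same filter itself:
# samtools view -c -f 4 accepted_hits.bam
```

If this prints 0 for accepted_hits.bam while samstat still reports unmapped reads, the difference would come from how samstat classifies reads rather than from the alignment.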

MAPQ is not per base; it is per read. All unmapped reads in TopHat output will have a MAPQ of 255, which indicates that the MAPQ cannot be calculated for them.

ADD REPLYlink written 4.5 years ago by geek_y10k
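
To make the per-read versus per-base distinction concrete, this small sketch prints, for each read, its single MAPQ value (column 5) next to the number of characters in the base quality string (column 11). The helper name mapq_vs_basequal is hypothetical, and it assumes SAM text on standard input; when base qualities are not stored the QUAL column is just *, so the length would be 1.

```shell
# Hypothetical helper: show that MAPQ is one integer per read, while
# QUAL (column 11) holds one quality character per base.
mapq_vs_basequal() {
  awk -F'\t' '
    /^@/ { next }                           # skip header lines, if any
    { print $1 "\t" $5 "\t" length($11) }   # read name, MAPQ, number of base qualities
  '
}

# Usage, assuming samtools is installed and the BAM file exists:
# samtools view unmapped.bam | head -n 5 | mapq_vs_basequal
```

Because there is only one MAPQ per read, any plot of MAPQ against read position necessarily shows the same category at every position of a given read.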