bimodal distribution of bam mapping quality
8.1 years ago

I have some RNA-Seq data that I aligned with TopHat2. The command I used was:

/path/to/tophat-2.1.0/tophat -p 20 -o ouputdir --library-type fr-firststrand /reference/homo_sapiens/GRCh38/ensembl/Sequence/Bowtie2Index/Homo_sapiens.GRC38 my_trimmed_data.fq.gz

This produced a file, accepted_hits.bam, with 50e6 aligned reads. When I plot a histogram of the mapping quality scores, roughly 27e6 reads have a mapping quality in the range 0-3 and 23e6 reads have a mapping quality of 50. There are _no_ values in between.

How should I interpret this bimodal distribution of mapping quality scores? This seems very strange to me.
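For anyone who wants to reproduce the tally described above, here is a minimal sketch that counts the MAPQ column (the fifth tab-separated field) of SAM-formatted text. The function name `mapq_histogram` and the toy records are made up for illustration; this is not necessarily how the original histogram was produced.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Count MAPQ values in an iterable of SAM text lines.

    MAPQ is the 5th tab-separated column of a SAM alignment record.
    Header lines (starting with '@') and blank lines are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[int(fields[4])] += 1
    return counts


# Toy records, invented for illustration only (not real data).
toy_sam = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t256\tchr2\t300\t3\t4M\t*\t0\t0\tACGT\tIIII\n",
]

for mapq, n in sorted(mapq_histogram(toy_sam).items()):
    print(f"{mapq}\t{n}")
```

On real data, the same function could be fed the text output of `samtools view accepted_hits.bam`, for example by passing `sys.stdin` when the script sits at the end of a pipe.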

alignment tophat RNA-Seq • 2.4k views
8.1 years ago
John 13k

This is normal :)

Tophat2 Mapping Qualities

8.1 years ago

Apart from TopHat, where intermediate MAPQ values are avoided by design, the mapping quality from other aligners (e.g. bwa) also tends to be markedly bimodal: reads tend either to map unambiguously to a single location or to map equally well to multiple locations.
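To see why reads that map equally well to several places end up with very low scores, recall that the SAM format defines MAPQ as -10 * log10(probability that the mapping position is wrong). The sketch below assumes the simplest possible model, in which a read matching n locations equally well is placed at one of them at random, so the chance of a wrong placement is 1 - 1/n. Real aligners use their own scoring schemes, so treat this as an illustration of the scale rather than as any aligner's actual formula; the function names are hypothetical.

```python
import math


def phred(p_wrong):
    """Phred-scale an error probability: -10 * log10(p_wrong)."""
    if not 0.0 < p_wrong <= 1.0:
        raise ValueError("p_wrong must be in (0, 1]")
    return -10.0 * math.log10(p_wrong)


def mapq_for_equal_hits(n_locations):
    """MAPQ under the assumed model: a read fits n locations equally well
    and one is picked at random, so P(wrong) = 1 - 1/n."""
    if n_locations < 2:
        raise ValueError("model only applies to reads with 2+ equal hits")
    return phred(1.0 - 1.0 / n_locations)


for n in (2, 3, 4, 10):
    print(n, round(mapq_for_equal_hits(n), 2))
```

Under this model every multi-mapped read scores about 3 or less, while a confidently unique read (error probability around 1e-5, say) scores around 50. Nothing naturally lands in between, which is consistent with the two clusters described in the question.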
