Calculate average mapping quality at a position
4.3 years ago
hpapoli ▴ 130

Hello,

I want to filter my non-variant positions by mapping quality. I used mpileup to output the mapping qualities across all sites. To do the filtering, can I take the arithmetic mean of the mapping qualities of the reads covering each base?

For example:

NW_008793873.1 13 G 2 .^S. AB HS


Position 13 is covered by two reads with mapping qualities H and S. Would (39 + 50)/2 = 44.5 be correct?
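As a quick check of the decoding above, here is a minimal Python sketch. It assumes the mapping-quality column uses the usual Phred+33 ASCII encoding (character code minus 33); the function name `decode_phred33` is made up for illustration.

```python
def decode_phred33(quals: str) -> list[int]:
    """Convert a Phred+33 encoded string into integer quality scores."""
    # Assumption: each character encodes one score as ASCII code - 33.
    return [ord(ch) - 33 for ch in quals]


mapqs = decode_phred33("HS")               # mapping-quality column of the example
arithmetic_mean = sum(mapqs) / len(mapqs)  # plain mean on the Phred scale
print(mapqs, arithmetic_mean)
```

This reproduces the values 39 and 50 and their arithmetic mean of 44.5.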

I ask because the root mean square is usually mentioned in connection with mapping qualities, so I was wondering what the correct approach would be in this case.

Thank you!

mapping quality mpileup • 2.3k views

Aren't you mixing up MAPQ (mapping qualities) and read qualities?


I think this is the output of "samtools mpileup -s", which prints the base qualities followed by the mapping qualities of the reads that support the base.
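To make that column layout concrete, here is a rough Python sketch that splits the example line into its seven fields. It assumes whitespace-separated columns in the order chromosome, position, reference base, depth, read bases, base qualities, mapping qualities, with both quality columns Phred+33 encoded. `PileupSite` and `parse_pileup_line` are hypothetical names, not part of samtools.

```python
from typing import NamedTuple


class PileupSite(NamedTuple):
    chrom: str
    pos: int
    ref: str
    depth: int
    bases: str
    base_quals: list[int]
    map_quals: list[int]


def parse_pileup_line(line: str) -> PileupSite:
    """Parse one 7-column pileup line (assumed 'mpileup -s' layout)."""
    chrom, pos, ref, depth, bases, bq, mq = line.split()
    n = int(depth)
    # Sites without coverage carry placeholder characters, so skip decoding there.
    base_quals = [ord(c) - 33 for c in bq] if n > 0 else []
    map_quals = [ord(c) - 33 for c in mq] if n > 0 else []
    return PileupSite(chrom, int(pos), ref, n, bases, base_quals, map_quals)


site = parse_pileup_line("NW_008793873.1 13 G 2 .^S. AB HS")
print(site.base_quals, site.map_quals)
```

For the example line, the sixth column ("AB") decodes to base qualities and the seventh ("HS") to the mapping qualities 39 and 50.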


Got it, thanks.

4.3 years ago
Gabriel R. ★ 2.8k

These are probabilities of mismapping on a PHRED scale. For the first one, the probability of mismapping is:

10^(-(39/10)) = 0.0001258925


For the second it is:

10^(-(50/10)) =  1e-05


So on average, your probability of mismapping is:

(0.0001258925+1e-05)/2 =  6.794627e-05


On a PHRED scale it is:

 -10*log10(6.794625e-05) =  41.67835
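The same calculation can be sketched in Python. This is only an illustration of the steps above: convert each mapping quality to a mismapping probability, average the probabilities, and convert back to the PHRED scale. The function names are made up for this example, and the root mean square that the question mentions is included purely for comparison.

```python
import math


def mean_mapq_prob_space(mapqs: list[int]) -> float:
    """Average mapping qualities via their mismapping probabilities."""
    if not mapqs:
        raise ValueError("need at least one mapping quality")
    probs = [10 ** (-q / 10) for q in mapqs]  # PHRED -> probability
    mean_p = sum(probs) / len(probs)          # average mismapping probability
    return -10 * math.log10(mean_p)           # probability -> PHRED


def rms_mapq(mapqs: list[int]) -> float:
    """Root mean square of the PHRED-scaled values, for comparison only."""
    if not mapqs:
        raise ValueError("need at least one mapping quality")
    return math.sqrt(sum(q * q for q in mapqs) / len(mapqs))


mapqs = [39, 50]
print(mean_mapq_prob_space(mapqs))  # about 41.68
print(rms_mapq(mapqs))              # about 44.84
print(sum(mapqs) / len(mapqs))      # 44.5, the arithmetic mean
```

Averaging in probability space is dominated by the lowest-quality read, so it gives a lower value than either the arithmetic mean or the root mean square of the PHRED scores.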


Thanks! By the way, do you know what kind of probability distribution mapping qualities have?


It depends on the aligner, but in any case it's a bit of a scam: https://sequencing.qcfail.com/articles/mapq-values-are-really-useful-but-their-implementation-is-a-mess/ The linked article has a plot of the distribution of mapping qualities.