Question: Calculate average mapping quality at a position
hpapoli70 wrote:

Hello,

I want to filter my non-variant positions by mapping quality. I used mpileup to output the mapping qualities across all sites. To do the filtering, can I take the arithmetic mean of the mapping qualities of the reads covering each base?

For example:

```
NW_008793873.1 13 G 2 .^S. AB HS
```

Position 13 is covered by two reads with mapping qualities H and S. Would (39 + 50)/2 = 44.5 be correct?

I ask because the root mean square is usually mentioned in connection with mapping qualities, so I was wondering what the correct approach would be in this case.
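For reference, here is a minimal Python sketch of the two summaries mentioned above, applied to the example column `HS`. It assumes the usual Phred+33 ASCII encoding of the mapping-quality column; the function names are illustrative and not part of samtools.

```python
import math


def decode_phred33(qual_string):
    """Convert a Phred+33 ASCII string (e.g. "HS") to integer qualities."""
    return [ord(ch) - 33 for ch in qual_string]


def arithmetic_mean(values):
    """Plain average of the Phred-scaled values."""
    return sum(values) / len(values)


def root_mean_square(values):
    """Square root of the mean of the squared Phred-scaled values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


mapqs = decode_phred33("HS")       # [39, 50]
print(arithmetic_mean(mapqs))      # 44.5
print(root_mean_square(mapqs))     # about 44.84
```

Both functions operate directly on the Phred-scaled values, so neither is an average of the underlying mismapping probabilities.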

Thank you!

modified 15 months ago by Gabriel R.2.6k • written 15 months ago by hpapoli70

Aren't you mixing up MAPQ mapping qualities and read qualities?

I think this is using `samtools mpileup -s`, which outputs the base qualities followed by the mapping qualities of the reads that support the base.
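As a rough illustration of that column layout, the sketch below splits the example line from the question into named fields. It assumes a single-sample pileup, whitespace-separated columns, and Phred+33 encoding for both quality columns; `parse_mpileup_line` is a hypothetical helper, not a samtools function.

```python
def parse_mpileup_line(line):
    """Split one single-sample mpileup record (produced with -s) into fields."""
    chrom, pos, ref, depth, bases, base_quals, map_quals = line.split()[:7]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "depth": int(depth),
        "bases": bases,
        # Both quality columns are assumed to be Phred+33 encoded.
        "base_quals": [ord(c) - 33 for c in base_quals],
        "map_quals": [ord(c) - 33 for c in map_quals],
    }


record = parse_mpileup_line("NW_008793873.1 13 G 2 .^S. AB HS")
print(record["base_quals"])  # [32, 33]
print(record["map_quals"])   # [39, 50]
```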

Got it, thanks.

Gabriel R.2.6k wrote:

These are probabilities of mismapping on a PHRED scale. For the first one, the probability of mismapping is:

```
10^(-(39/10)) = 0.0001258925
```

For the second it is:

```
10^(-(50/10)) = 1e-05
```

So on average, your probability of mismapping is:

```
(0.0001258925 + 1e-05)/2 = 6.794625e-05
```

On a PHRED scale it is:

```
-10*log10(6.794625e-05) = 41.67835
```
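The calculation above can be sketched as a small Python function. This is only an illustration of the same arithmetic; the function names are made up for the example, and it assumes a non-empty list of Phred-scaled mapping qualities.

```python
import math


def phred_to_prob(q):
    """Phred-scaled quality -> error probability."""
    return 10 ** (-q / 10)


def prob_to_phred(p):
    """Error probability -> Phred-scaled quality."""
    return -10 * math.log10(p)


def mean_mapq_via_probabilities(mapqs):
    """Average the mismapping probabilities, then convert back to Phred."""
    probs = [phred_to_prob(q) for q in mapqs]
    return prob_to_phred(sum(probs) / len(probs))


print(round(mean_mapq_via_probabilities([39, 50]), 2))  # 41.68
```

Because the averaging happens on the probabilities, the lower-quality read dominates and the result (about 41.68) is below the arithmetic mean of the Phred values (44.5).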

Thanks! By the way, do you know what kind of probability distribution mapping qualities have?

It depends on the aligner, but in any case it's a bit of a scam: https://sequencing.qcfail.com/articles/mapq-values-are-really-useful-but-their-implementation-is-a-mess/ The link above has a plot of the distribution of mapping qualities.