I've aligned my paired-end 100 nt reads to the reference using BWA and called SNPs using GATK. When I looked at the mapping quality (MQ) distribution of the SNPs and its relation to Ts/Tv, I found that both low (<4) and high (>50) MQ values correspond to low Ts/Tv. I understand that low MQ implies possible mapping errors, so it makes sense that there would be more variant calling errors and a lower Ts/Tv. But what would explain low Ts/Tv at high MQ?
From what I've read, BWA overestimates mapping qualities, but this still doesn't explain why Ts/Tv decreases specifically at high MQ values (if overestimation were the cause, I would expect Ts/Tv to be lower throughout the range of MQ values, correct?).
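For reference, the Ts/Tv-per-MQ-bin tabulation described above could be sketched roughly as follows. This is only a minimal illustration, not necessarily how the numbers above were produced. It assumes an uncompressed VCF whose INFO column carries an `MQ=` entry, and it counts only biallelic SNPs. The function names `parse_snp` and `tstv_by_mq` and the bin width of 10 are hypothetical choices made for the example.

```python
import math
from collections import defaultdict

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T);
# every other single-base change is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def parse_snp(line):
    """Return (ref, alt, mq) for a biallelic SNP record, or None otherwise."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return None
    ref, alt, info = fields[3].upper(), fields[4].upper(), fields[7]
    # Skip indels, multi-allelic sites and non-ACGT alleles.
    if len(ref) != 1 or len(alt) != 1 or ref not in "ACGT" or alt not in "ACGT":
        return None
    # Assumption: MQ is stored as an INFO key of the form MQ=<number>.
    for entry in info.split(";"):
        if entry.startswith("MQ="):
            try:
                mq = float(entry[3:])
            except ValueError:
                return None
            return None if math.isnan(mq) else (ref, alt, mq)
    return None


def tstv_by_mq(lines, bin_width=10):
    """Map each MQ bin start to (n_transitions, n_transversions, ts_tv_ratio)."""
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        if line.startswith("#"):
            continue  # header lines
        record = parse_snp(line)
        if record is None:
            continue
        ref, alt, mq = record
        bin_start = int(mq // bin_width) * bin_width
        index = 0 if (ref, alt) in TRANSITIONS else 1
        counts[bin_start][index] += 1
    return {
        b: (ts, tv, ts / tv if tv else None)  # ratio undefined without transversions
        for b, (ts, tv) in sorted(counts.items())
    }
```

Calling `tstv_by_mq(open("calls.vcf"))` on a hypothetical file `calls.vcf` would return one entry per MQ bin, which makes it easy to check whether the drop at high MQ rests on only a handful of SNPs in that bin.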
So, my questions are:
- Does GATK directly report the MQ value coming from BWA? If not, what exactly is reported?
- Does anyone have an idea why Ts/Tv decreases (and, presumably, the SNP calling error rate increases) at high MQ values?
Thank you, Ines.