Hi, I am having some issues with the VCF files generated by the GATK caller: they do not report a mapping quality (MQ) value for many positions, especially invariant sites.
Since every read in the BAM files has a mapping quality score, I assume there is a way to get that value for every position without using GATK. What are some alternatives? When multiple samples are used, should the MQ simply be averaged across samples at each position?
In case you are wondering how I am using GATK, the relevant commands are below:
java -jar gatk HaplotypeCaller -I file.bam -O file.g.vcf -R reference.fa -ploidy 1 -ERC BP_RESOLUTION
# The above is done for different input files
java -jar gatk CombineGVCFs -R reference.fa -O combined.g.vcf --variant file1.g.vcf --variant file2.g.vcf ...
java -jar gatk GenotypeGVCFs -R reference.fa -V combined.g.vcf -O variants.vcf -ploidy 1 -all-sites
For some reason, this results in many MQ values being absent from the final VCF file (and in many QUAL values being reported as Infinity).
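To make the question about alternatives concrete, here is a rough sketch of what I imagine computing directly from the alignments. The VCF header describes MQ as "RMS Mapping Quality", so the sketch takes the root mean square of the MAPQ of every read covering a position rather than a plain mean. The function name rms_mq_per_position is my own, and the sketch makes simplifying assumptions: it reads SAM text on standard input, counts only aligned bases (CIGAR M, = and X), and applies no read filtering (duplicates, secondary alignments, and so on), so I would not expect it to reproduce GATK's numbers exactly.

```shell
# Sketch only: per-position RMS mapping quality from SAM text on stdin.
# Output columns: contig, 1-based position, read depth, RMS MAPQ.
rms_mq_per_position() {
  awk -F'\t' '
    /^@/ { next }                          # skip SAM header lines
    {
      contig = $3; pos = $4 + 0; mapq = $5 + 0; cigar = $6
      if (cigar == "*") next               # no alignment available
      # Walk the CIGAR string one operation at a time.
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        len = substr(cigar, 1, RLENGTH - 1) + 0
        op  = substr(cigar, RLENGTH, 1)
        if (op == "M" || op == "=" || op == "X") {
          # Aligned bases: each covered position gets this read MAPQ.
          for (i = 0; i < len; i++) {
            key = contig "\t" (pos + i)
            n[key]++
            ss[key] += mapq * mapq
          }
          pos += len
        } else if (op == "D" || op == "N") {
          pos += len                       # consumes reference, no read base
        }                                  # I, S, H, P do not move along the reference
        cigar = substr(cigar, RLENGTH + 1)
      }
    }
    END {
      for (key in n) printf "%s\t%d\t%.2f\n", key, n[key], sqrt(ss[key] / n[key])
    }
  ' | sort -t "$(printf '\t')" -k1,1 -k2,2n
}
```

With samtools installed, something like `{ samtools view file1.bam; samtools view file2.bam; } | rms_mq_per_position` would pool the reads from several samples before taking the RMS, instead of averaging per-sample values (file1.bam and file2.bam are placeholder names). Whether pooling is the right way to combine samples is part of what I am asking.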