Hi everyone,
I performed variant calling, SNP extraction, and hard filtering with GATK. After filtering my SNPs, I noticed that my log file was filled with warnings like the following:
01:40:43.493 WARN JexlEngine - ![0,14]: 'ReadPosRankSum < -8.0;' undefined variable ReadPosRankSum
01:40:43.493 WARN JexlEngine - ![0,9]: 'MQRankSum < -12.5;' undefined variable MQRankSum
For this particular sample, I used grep to count how many times each warning appeared: 1189 times for ReadPosRankSum and 1169 times for MQRankSum.
Also with grep, I verified that these numbers match the number of SNP entries that lack the corresponding annotation in the INFO column.
Here is an example of what I am describing, taken from two different extracted SNPs (for clarity, I'm only showing the INFO column):
AC=2;AF=1.00;AN=2;DP=85;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=58.76;QD=31.06;SOR=1.179
AC=2;AF=1.00;AN=2;BaseQRankSum=1.834;DP=8;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=26.28;MQRankSum=1.834;QD=10.66;ReadPosRankSum=1.834;SOR=3.258
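The grep-based count described above could also be reproduced with a short Python sketch. This is only an illustration, assuming an uncompressed, tab-delimited VCF with INFO in the eighth column; the function name `count_missing_info_keys` and the file name `snps.vcf` are hypothetical and not part of GATK.

```python
def count_missing_info_keys(lines, keys=("ReadPosRankSum", "MQRankSum")):
    """Count VCF records whose INFO column lacks each annotation in `keys`.

    `lines` is any iterable of VCF text lines (e.g. an open file handle).
    """
    missing = {key: 0 for key in keys}
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        # INFO is the 8th column: semicolon-separated KEY=VALUE pairs or bare flags.
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        for key in keys:
            if key not in info_keys:
                missing[key] += 1
    return missing


# Example usage (hypothetical file name):
# with open("snps.vcf") as handle:
#     print(count_missing_info_keys(handle))
```

Matching on exact key names, rather than on substrings, avoids accidentally counting `MQRankSum` as present just because another key contains similar text.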
Due to these missing fields in the INFO column, I'm worried my hard filtering step will be flawed.
Can anyone explain why these would be missing from some SNP entries and not others? I'm happy to post any supporting info that might help with a solution.
In advance, thank you for the help!