GATK VariantFiltration unexpected behaviour?
1
0
Entering edit mode
4.4 years ago
jtwalker ▴ 20

Hello,

I'm trying to use GATK to hard-filter variants (I'm working with a non-model organism, so variant recalibration isn't possible). Although I'm following the tutorial on the GATK website (https://gatkforums.broadinstitute.org/gatk/discussion/2806/howto-apply-hard-filters-to-a-call-set), I haven't actually been able to filter out any variants.

As an example, after subsetting the SNPs from the VCF file produced by GenotypeGVCFs, I used

gatk -T VariantFiltration -R /PATH/reference_genome -V myfile.vcf  --filterExpression "MQ>20" --filterName "mq20_filter"  -o my_filtered_file.vcf

which I expected to flag any variant with mapping quality below 20 by writing the filter name in the FILTER column instead of PASS. When I used grep to check whether this had worked the way I expected, I found many records where MQ=10.
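A check like this can be made a little more precise than a plain grep. The following is only a sketch: the three records are made-up stand-ins (the real input would be my_filtered_file.vcf), and the variable names are hypothetical. It lists every record that is still PASS even though its MQ value in the INFO column is below 20.

```shell
# Tiny stand-in VCF with hypothetical records, written to a temp file.
vcf=$(mktemp)
{
    printf '##fileformat=VCFv4.2\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t100\t.\tC\tT\t61.23\tPASS\tDP=75;MQ=10.95;QD=30.62\n'
    printf 'chr1\t200\t.\tG\tA\t80.00\tmq20_filter\tDP=40;MQ=60.00;QD=25.10\n'
    printf 'chr1\t300\t.\tA\tG\t55.00\tPASS\tDP=30;MQ=45.50;QD=20.00\n'
} > "$vcf"

# Print CHROM, POS, FILTER and MQ for every PASS record whose MQ is below 20.
low_mq_pass=$(awk -F'\t' '
    /^#/ { next }                          # skip header lines
    {
        mq = ""
        n = split($8, info, ";")           # INFO is column 8
        for (i = 1; i <= n; i++)
            if (info[i] ~ /^MQ=/) mq = substr(info[i], 4)
        if ($7 == "PASS" && mq != "" && mq + 0 < 20)
            print $1, $2, $7, mq
    }' "$vcf")

echo "$low_mq_pass"
rm -f "$vcf"
```

Matching on `^MQ=` keeps other annotations whose names merely begin with MQ out of the comparison, and checking column 7 separates records that were flagged from those that were not.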

There seems to be something wrong with my JEXL expression that I'm simply not seeing in the tutorial or the other GATK documentation. Can anyone help me out?

Thanks!

edit: fixed a minor mistake

SNP GATK variant filtration • 1.9k views

Please show us a line in the VCF that fails the expression and, from the header, the line starting with ##INFO=<ID=MQ...


A line from the VCF:

NC_024218.1 10250670    .   C   T   61.23   PASS    AC=2;AF=0.333;AN=6;DP=75;ExcessHet=1.0474;FS=0.000;MLEAC=2;MLEAF=0.333;MQ=10.95;QD=30.62;SOR=0.693  GT:AD:DP:GQ:PGT:PID:PL  0/0:6,0:6:18:.:.:0,18,230   ./.:1,0:1:.:.:.:0,0,0   1/1:0,2:2:6:1|1:10250670_C_T:90,6,0 ./.:2,0:2:.:.:.:0,0,0   ./.:10,0:10:.:.:.:0,0,0 ./.:0,0:0:.:.:.:0,0,0   ./.:1,0:1:.:.:.:0,0,0   ./.:17,0:17:.:.:.:0,0,0 ./.:9,0:9:.:.:.:0,0,0   0/0:9,0:9:0:.:.:0,0,258 ./.:0,0:0:.:.:.:0,0,0   ./.:2,0:2:.:.:.:0,0,0   ./.:2,0:2:.:.:.:0,0,0   ./.:1,0:1:.:.:.:0,0,0   ./.:0,0:0:.:.:.:0,0,0   ./.:13,0:13:.:.:.:0,0,0

I'm not quite sure what you mean by a line from the header, but I grepped for your example and this is what came back:

##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">

Thanks!


I've encountered the same issue. Could you please provide your solution if you were able to fix this problem?

4.4 years ago

Isn't there a syntax problem here? I thought that GATK's filter expressions couldn't take integers and needed doubles; otherwise it would throw an error.

It should be:

--filterExpression "MQ > 20.0"
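
One more thing that may matter here, going by the hard-filtering tutorial linked in the question: the filter expression describes the records to flag, not the ones to keep. With MQ > 20.0, a record with MQ=10.95 evaluates to false and stays PASS, which matches the line posted above. If the goal is to flag mapping quality below 20, the expression would presumably be "MQ < 20.0". The sketch below is a toy emulation of that rule in awk, not GATK itself; the three-column layout (POS, FILTER, MQ), the `records` variable and the `flag_if` helper are made up for illustration.

```shell
# Toy emulation in awk (not GATK itself): a record gets the filter name
# when the expression is TRUE, otherwise it stays PASS.
# Hypothetical cut-down layout per record: POS, FILTER, MQ.
records='100 PASS 10.95
200 PASS 60.00'

flag_if() {
    # $1 is an awk condition on the variable mq, e.g. 'mq < 20.0'
    printf '%s\n' "$records" |
        awk '{ mq = $3 + 0; if ('"$1"') $2 = "mq20_filter"; print }'
}

gt_result=$(flag_if 'mq > 20.0')   # direction used in the thread
lt_result=$(flag_if 'mq < 20.0')   # flags the low-MQ record instead

printf 'MQ > 20.0:\n%s\n\nMQ < 20.0:\n%s\n' "$gt_result" "$lt_result"
```

Only the second form marks the MQ=10.95 record with mq20_filter; the first one flags the well-mapped record and leaves the low-MQ one as PASS.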

It didn't give me an error, but I'll try that now. Thanks!
