Question: GATK VariantFiltration does not filter [SOLVED]
5 months ago, romain.coppee wrote:

Hi everyone,

I'm trying to use GATK to hard-filter variants but, although I'm following the tutorial on the GATK website, I haven't been able to filter out any variants. I want to filter on the VQSLOD annotation.

I tried the filtering with both GATK 3.8 and 4.1, but in both cases no variants are filtered, and I get no error output.

GATK 3.8 version:

java -jar /home/maintenance-gg/Téléchargements/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar \
-T VariantFiltration \
-R /home/maintenance-gg/Documents/Reference_genome/Pfalciparum.genome.fasta \
--filterName LowQualVQ -filter "VQSLOD <= 0.0" \
--variant /home/maintenance-gg/Documents/VCF2/SNPs.vcf \
-log /home/maintenance-gg/Documents/VCF2/filtration.txt \
-o /home/maintenance-gg/Documents/VCF2/SNP_filtered5.vcf

GATK 4.1 version:

gatk VariantFiltration \
-R /home/maintenance-gg/Documents/Reference_genome/Pfalciparum.genome.fasta \
-V /home/maintenance-gg/Documents/VCF2/calling_GVCF.vcf \
--filter-name LowQualVQ -filter "VQSLOD <= 0.0" \
-O /home/maintenance-gg/Documents/VCF2/SNP_filtered5.vcf

Can anyone help me out? I have searched for a solution on Biostars and the GATK support forum, but I haven't found one. All I learned is that GATK filter expressions cannot take integers and need doubles.

Here are the INFO header line for VQSLOD and an example SNP line from my VCF before filtration.


##INFO=<ID=VQSLOD,Number=1,Type=Float,Description="Log odds of being a true variant versus being false under the trained gaussian mixture model">

SNP example:

Pf3D7_01_v3 176 .   G   A   107.14  PASS    AC=2;AF=1.00;AN=2;DP=5;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=34.35;QD=26.79;SOR=3.258;VQSLOD=3.39;culprit=SOR GT:AD:DP:GQ:PL  1/1:0,4:4:12:121,12,0   ./.:0,0 ./.:6,0:6   ./.:0,0 ./.:5,0:5   ./.:0,0 ./.:0,0 ./.:3,0:3   ./.:20,0:20 ./.:0,0 ./.:8,0:8   ./.:0,0 ./.:5,0:5   ./.:0,0 ./.:0,0 ./.:0,0 ./.:0,0 ./.:0,0

Please let me know if you need any additional information.


5 months ago, romain.coppee wrote:

Thank you for your reply ;)

Harold, the PASS was due to another filter.

However, I now understand my problem. VariantFiltration did its job correctly: it annotates the FILTER column of each variant, but it does not remove anything from the VCF. The next step is to actually drop the flagged variants, and I had not done that.

For this, I used the following GATK command:

gatk SelectVariants \
-R /home/maintenance-gg/Documents/Reference_genome/Pfalciparum.genome.fasta \
-V /home/maintenance-gg/Documents/VCF2/SNP_filtered.vcf \
-O /home/maintenance-gg/Documents/VCF2/SNP_filtered2.vcf \
-select 'vc.isNotFiltered()'

The -select expression keeps only the variants that passed my filters.
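As a cross-check outside GATK, the same selection can be sketched with plain awk. This is only an illustration, assuming an uncompressed, tab-delimited VCF in which unfiltered records carry PASS or "." in the FILTER column (field 7); the function name keep_unfiltered is hypothetical and not part of GATK.

```shell
# Keep all header lines, plus records whose FILTER column (field 7)
# is PASS or "." (i.e. no filter was applied or none failed).
# Assumes an uncompressed, tab-delimited VCF given as the first argument.
keep_unfiltered() {
    awk -F'\t' '/^#/ || $7 == "PASS" || $7 == "."' "$1"
}
```

Usage would look like `keep_unfiltered SNP_filtered.vcf > SNP_filtered2.vcf`. As far as I know, GATK4 SelectVariants also offers an --exclude-filtered option with a similar effect; check the help of your version before relying on it.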

Thanks a lot for your help, guys ;)


5 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Isn't the logic reversed? The variant will be flagged as LowQualVQ if "VQSLOD <= 0.0". Can you please try with "VQSLOD > 0.0"?
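To see how many records a given threshold should flag, independently of GATK, one can count them directly from the INFO column. This is a minimal sketch, assuming an uncompressed, tab-delimited VCF whose INFO field (field 8) contains a VQSLOD=<number> entry; the function name count_vqslod_at_or_below is hypothetical.

```shell
# Count records whose INFO field has VQSLOD <= threshold.
#   $1: uncompressed VCF file
#   $2: threshold (optional, defaults to 0.0)
# Records without a VQSLOD entry are ignored.
count_vqslod_at_or_below() {
    awk -F'\t' -v t="${2:-0.0}" '
        /^#/ { next }                      # skip header lines
        {
            n = split($8, kv, ";")         # INFO is a ;-separated key=value list
            for (i = 1; i <= n; i++) {
                if (kv[i] ~ /^VQSLOD=/) {
                    v = substr(kv[i], 8)   # text after "VQSLOD="
                    if (v + 0 <= t + 0) c++
                }
            }
        }
        END { print c + 0 }
    ' "$1"
}
```

For example, `count_vqslod_at_or_below SNPs.vcf 0.0` gives the number of records that "VQSLOD <= 0.0" is expected to flag. The example SNP in the question has VQSLOD=3.39, so it would not be counted.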


5 months ago, romain.coppee replied:

Hi, thanks for your reply. The name of my filter is wrong, you're right; but even when I run the command as you propose, nothing changes: no variants are filtered.
5 months ago, harold.smith.tarheel (United States) wrote:

From the command description:

"A filtered VCF in which passing variants are annotated as PASS and failing variants are annotated with the name(s) of the filter(s) they failed."

The SNP example you posted is annotated as 'PASS', as it should be: its VQSLOD is 3.39, so it does not match "VQSLOD <= 0.0".
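A quick way to check what VariantFiltration actually wrote is to tally the FILTER column of its output. This is a rough sketch, assuming an uncompressed, tab-delimited VCF; the function name filter_counts is made up for illustration.

```shell
# Print each distinct FILTER value (field 7) with its record count,
# one "value<TAB>count" pair per line, sorted by value.
# Assumes an uncompressed, tab-delimited VCF given as the first argument.
filter_counts() {
    awk -F'\t' '
        /^#/ { next }                          # skip header lines
        { n[$7]++ }                            # tally by FILTER value
        END { for (f in n) print f "\t" n[f] }
    ' "$1" | sort
}
```

Running `filter_counts SNP_filtered5.vcf` should then list PASS alongside any filter names such as LowQualVQ. If only PASS appears, no record matched the filter expression.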
