Question: QD Annotation error in GATK in VariantRecalibration step
9 months ago by
win790 wrote:

Hi all, I downloaded a BAM file from 1000 Genomes, converted it to FASTQ, and aligned it to hg38 with bwa mem. This worked fine.

Later I ran VariantRecalibrator and ApplyRecalibration for SNPs with the following commands, and they worked fine as well.

 java -Xmx8g -jar algorithms/gatk3/gatk.jar -T VariantRecalibrator -R references/hg38gatkbundle/Homo_sapiens_assembly38.fasta -input data/HG100/HG100.output.raw.combined.vcf -mode SNP -resource:hapmap,known=false,training=true,truth=true,prior=15.0 references/hg38gatkbundle/hapmap_3.3.hg38.vcf -resource:omni,known=false,training=true,truth=true,prior=12.0 references/hg38gatkbundle/1000G_omni2.5.hg38.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 references/hg38gatkbundle/1000G_phase1.snps.high_confidence.hg38.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 references/hg38gatkbundle/Homo_sapiens_assembly38.dbsnp138.vcf -an DP -an QD -an FS -an SOR -an MQ -an MQRankSum -an ReadPosRankSum --maxGaussians 4 -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -recalFile data/HG100/HG100.recalibrate.SNP.recal -tranchesFile data/HG100/HG100.recalibrate.SNP.tranches -rscriptFile data/HG100/HG100.recalibrate.SNP.plots.R


java -Xmx8g -jar algorithms/gatk3/gatk.jar -T ApplyRecalibration -R references/hg38gatkbundle/Homo_sapiens_assembly38.fasta -input data/HG100/HG100.output.raw.combined.vcf -mode SNP --ts_filter_level 99.5 -recalFile data/HG100/HG100.recalibrate.SNP.recal -tranchesFile data/HG100/HG100.recalibrate.SNP.tranches -o data/HG100/HG100.recalibrated.snp.vcf

Next I wanted to run VariantRecalibrator for indels, so I used the following command:

java -Xmx8g -jar algorithms/gatk3/gatk.jar -T VariantRecalibrator -R references/hg38gatkbundle/Homo_sapiens_assembly38.fasta -input data/HG100/HG100.output.raw.combined.vcf -mode INDEL -resource:mills,known=false,training=true,truth=true,prior=12.0 references/hg38gatkbundle/Mills_and_1000G_gold_standard.indels.hg38.vcf -resource:dnsnp,known=true,training=false,truth=false,prior=2.0 references/hg38gatkbundle/Homo_sapiens_assembly38.dbsnp138.vcf -an QD -an DP -an FS -an SOR -an ReadPosRankSum -an MQRankSum -an InbreedingCoeff --maxGaussians 4  -recalFile data/HG100/HG100.recalibrate.INDEL.recal -tranchesFile data/HG100/HG100.recalibrate.INDEL.tranches -rscriptFile data/HG100/HG100.recalibrate.INDEL.plots.R

The issue is that when I run the above command I get an annotation-related error, e.g. "QD annotation is not found on any input callsets". I have confirmed that the input VCF has the annotations, since they were specified when running UnifiedGenotyper.
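One way to double-check that the annotation is present on the records that matter is to count, separately for SNPs and indels, how many records carry it in the INFO column. The following is only a minimal sketch: the function name `count_annotation` is hypothetical, it assumes an uncompressed VCF, and it treats any record whose ALT length differs from REF as an indel, which is a simplification (symbolic alleles would be misclassified).

```shell
# Hypothetical helper: count records carrying a given INFO key, split by variant type.
# Assumes a plain-text (uncompressed) VCF with tab-separated columns.
count_annotation() {
    vcf="$1"   # path to the VCF file
    key="$2"   # INFO key to look for, e.g. QD
    awk -F'\t' -v key="$key" '
        /^#/ { next }                      # skip header lines
        {
            # Simplified classification: indel if any ALT allele length differs from REF
            kind = "SNP"
            n = split($5, alts, ",")
            for (i = 1; i <= n; i++)
                if (length(alts[i]) != length($4)) kind = "INDEL"
            total[kind]++
            # INFO is column 8; entries are separated by ";"
            m = split($8, info, ";")
            for (j = 1; j <= m; j++)
                if (info[j] ~ ("^" key "=")) { found[kind]++; break }
        }
        END {
            printf "SNP %d/%d\n", found["SNP"], total["SNP"]
            printf "INDEL %d/%d\n", found["INDEL"], total["INDEL"]
        }
    ' "$vcf"
}
```

For example, `count_annotation data/HG100/HG100.output.raw.combined.vcf QD` prints two lines of the form `SNP with/total` and `INDEL with/total`. If the INDEL line shows a total of 0, the VCF may contain no indel records at all, which would be worth ruling out before looking further.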

Any ideas why this is happening? I am trying to detect variants from a single BAM file.

Any help will be highly appreciated.


Any ideas as to why this is happening?

written 9 months ago by win790

This link might be helpful

I'm guessing you are using a version of GATK older than 4.0? I don't know a lot about it, but I tried to recreate your error. UnifiedGenotyper has been deprecated and is not recommended. However, since its documentation lists the annotation flag as -A, maybe you have to type the full name QualByDepth, as with VariantAnnotator, instead of -A QD.

Also, you can look here for some instructions on how to find the available commands.

Good luck, and sorry I don't have a straightforward answer for you.

written 9 months ago by skbrimer500