Question: QD Annotation error in GATK in VariantRecalibration step
2.0 years ago by
win820 wrote:

Hi all, I downloaded a BAM file from 1000 Genomes to work on, converted it to FASTQ, and aligned it to hg38 with bwa mem. This worked fine.

Later I wanted to run VariantRecalibrator and used the following commands for SNPs. They worked fine as well.

 java -Xmx8g -jar algorithms/gatk3/gatk.jar -T VariantRecalibrator -R references/hg38gatkbundle/Homo_sapiens_assembly38.fasta -input data/HG100/HG100.output.raw.combined.vcf -mode SNP -resource:hapmap,known=false,training=true,truth=true,prior=15.0 references/hg38gatkbundle/hapmap_3.3.hg38.vcf -resource:omni,known=false,training=true,truth=true,prior=12.0 references/hg38gatkbundle/1000G_omni2.5.hg38.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 references/hg38gatkbundle/1000G_phase1.snps.high_confidence.hg38.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 references/hg38gatkbundle/Homo_sapiens_assembly38.dbsnp138.vcf -an DP -an QD -an FS -an SOR -an MQ -an MQRankSum -an ReadPosRankSum --maxGaussians 4 -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -recalFile data/HG100/HG100.recalibrate.SNP.recal -tranchesFile data/HG100/HG100.recalibrate.SNP.tranches -rscriptFile data/HG100/HG100.recalibrate.SNP.plots.R


java -Xmx8g -jar algorithms/gatk3/gatk.jar -T ApplyRecalibration -R references/hg38gatkbundle/Homo_sapiens_assembly38.fasta -input data/HG100/HG100.output.raw.combined.vcf -mode SNP --ts_filter_level 99.5 -recalFile data/HG100/HG100.recalibrate.SNP.recal -tranchesFile data/HG100/HG100.recalibrate.SNP.tranches -o data/HG100/HG100.recalibrated.snp.vcf

Next I wanted to run VariantRecalibrator for indels, so I used the following command:

java -Xmx8g -jar algorithms/gatk3/gatk.jar -T VariantRecalibrator -R references/hg38gatkbundle/Homo_sapiens_assembly38.fasta -input data/HG100/HG100.output.raw.combined.vcf -mode INDEL -resource:mills,known=false,training=true,truth=true,prior=12.0 references/hg38gatkbundle/Mills_and_1000G_gold_standard.indels.hg38.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 references/hg38gatkbundle/Homo_sapiens_assembly38.dbsnp138.vcf -an QD -an DP -an FS -an SOR -an ReadPosRankSum -an MQRankSum -an InbreedingCoeff --maxGaussians 4 -recalFile data/HG100/HG100.recalibrate.INDEL.recal -tranchesFile data/HG100/HG100.recalibrate.INDEL.tranches -rscriptFile data/HG100/HG100.recalibrate.INDEL.plots.R

The issue I am facing is that when I run the above command I get an annotation-related error, e.g. that the QD annotation is not found in any of the input callsets. I have confirmed that the input VCF has the annotations, since they were specified in the UnifiedGenotyper run.

Any ideas why this is happening? I am trying to detect variants from a single BAM file.
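
One way to double-check the presence of QD on the indel records specifically (rather than in the VCF as a whole) is a quick count over the INFO column. This is only a minimal sketch under a few assumptions: the function name count_qd_indels is made up for illustration, the VCF is uncompressed, and a record is treated as an indel whenever any ALT allele differs in length from REF, which is a simplification. It does not explain the error by itself; it only shows whether indel records carry a QD value at all.

```shell
# Hypothetical helper: count indel records with and without a QD entry in INFO.
# Assumes an uncompressed, tab-separated VCF passed as the first argument.
count_qd_indels() {
  awk -F'\t' '
    /^#/ { next }                       # skip header lines
    {
      is_indel = 0
      n = split($5, alts, ",")          # ALT may list several alleles
      for (i = 1; i <= n; i++) {
        # ignore symbolic alleles such as <NON_REF> and the "*" placeholder
        if (alts[i] ~ /^</ || alts[i] == "*") continue
        # simplification: any length difference from REF counts as an indel
        if (length(alts[i]) != length($4)) is_indel = 1
      }
      if (!is_indel) next
      # INFO is column 8; QD must start the field or follow a semicolon
      if ($8 ~ /(^|;)QD=/) with_qd++; else without_qd++
    }
    END { printf "indels_with_QD=%d indels_without_QD=%d\n", with_qd, without_qd }
  ' "$1"
}
```

Calling it as count_qd_indels data/HG100/HG100.output.raw.combined.vcf prints the two counts on one line.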

Any help will be highly appreciated.


Any ideas as to why this is happening?

written 2.0 years ago by win820

This link might be helpful

I'm guessing you are using a GATK version earlier than 4.0? I don't know a lot about it, but I tried to recreate your error. UnifiedGenotyper has been deprecated and is not recommended. However, since its documentation lists the annotation flag as -A, maybe you have to type QualByDepth, as with VariantAnnotator, instead of -A QD.

Also, you can look here for some instructions on how to find the available commands.

Good luck, and sorry I don't have a straightforward answer for you.

written 2.0 years ago by skbrimer610