Question: How to Create a Truth & training set for Variant Recalibrator
yoh1242 wrote:

Hello! I need a step-by-step guide on how to create a truth and training set for VariantRecalibrator in GATK v4.1.4. The guide available on the GATK website was not helpful enough, and I have yet to make progress.

I have downloaded a known-indels file (of the kind used for BQSR) from the 1000bullgenome website and plan to use it for the -resource part of the command line. Kindly mention the steps necessary to turn this file into a suitable resource file for the VariantRecalibrator step of GATK.

Tags: sequencing, snp, next-gen, genome
written 5 weeks ago by yoh1242

"was not helpful enough"

Where did you face problems? Which guide are you referring to and how far did you get?

written 5 weeks ago by RamRS

I have run HaplotypeCaller but cannot proceed with VariantRecalibrator because I lack a resource file.

I have downloaded 1000bullgenome.vcf and performed variant calling and VariantFiltration on the file, but when I run the following command line:

java -Xmx8g -jar /home/zafar/miniconda3/share/gatk4-4.1.4.1-0/gatk-package-4.1.4.1-local.jar VariantRecalibrator \
    -R UMD3.1_chromosomes.fa \
    -V H-BQSRSunny2.vcf \
    -resource:1000bullgenome,known=true,training=true,truth=true,prior=6.0 ARSUMD.Filtered.vcf \
    -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR \
    -mode SNP \
    -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
    -O Sunny2VR.recal \
    -tranches-file Sunny2VR.tranches \
    -rscript-file Sunny2VR.plots.R

I get the following error:

A USER ERROR has occurred: Bad input: Values for QD annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.

I have tried both GATK VariantAnnotator and snpEff on the resource VCF file, but neither has worked.
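The error message refers to annotations in the input callset, which is the file passed with -V, so it may be worth checking which of the requested -an annotations actually appear in that file's INFO column. Here is a minimal sketch of such a check, assuming a plain or gzip-compressed VCF; the function name count_info_keys is made up for illustration and is not part of GATK.

```python
import gzip
import sys


def count_info_keys(vcf_path, keys=("QD", "MQ", "MQRankSum", "ReadPosRankSum", "FS", "SOR")):
    """Return (number of records, {INFO key: records carrying that key}).

    Hypothetical helper for illustration only; it is not a GATK tool.
    """
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    counts = {key: 0 for key in keys}
    total = 0
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # not a complete VCF record
            total += 1
            # INFO is the 8th column: ';'-separated entries, either KEY=VALUE or a bare flag.
            present = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
            for key in keys:
                if key in present:
                    counts[key] += 1
    return total, counts


if __name__ == "__main__" and len(sys.argv) > 1:
    n_records, per_key = count_info_keys(sys.argv[1])
    print(f"{n_records} records")
    for key, n in per_key.items():
        print(f"{key}: present in {n} records")
```

Saved under a name of your choice and run on H-BQSRSunny2.vcf, a count of zero for QD would suggest that the annotation is missing from the callset itself rather than from the resource file. This only inspects INFO keys; it does not check whether any records overlap the training sites.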

modified 4 weeks ago by RamRS • written 4 weeks ago by yoh1242

You've still not pointed us to the "guide available on GATK website" you refer to.

written 4 weeks ago by RamRS

I can't seem to find the link on the new GATK forum. Regardless, the steps were as follows: perform SelectVariants, followed by VariantFiltration, and finish with variant annotation on the known SNP and indel VCF file to create a training-set resource file. Despite following these steps, I have yet to get any result.
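For reference, the first two steps described above (SelectVariants, then VariantFiltration) could be sketched as a small script that assembles the GATK command lines. This is only an illustration under stated assumptions: the output file names, the filter name hard_filter, and the default filter expression are placeholders rather than values from the guide, and the annotation step is left out because its inputs are not given in this thread.

```python
import shlex


def build_training_set_commands(reference, known_vcf, prefix,
                                filter_expression="QD < 2.0 || FS > 60.0 || MQ < 40.0"):
    """Build (but do not run) GATK4 commands for a filtered SNP resource file.

    The filter expression and the output names are assumptions for illustration.
    """
    snps = f"{prefix}.snps.vcf"
    flagged = f"{prefix}.snps.flagged.vcf"
    passed = f"{prefix}.snps.pass.vcf"
    return [
        # Step 1: keep only SNP records from the known-variants file.
        ["gatk", "SelectVariants", "-R", reference, "-V", known_vcf,
         "--select-type-to-include", "SNP", "-O", snps],
        # Step 2: mark records that match the (assumed) hard-filter expression.
        ["gatk", "VariantFiltration", "-R", reference, "-V", snps,
         "--filter-expression", filter_expression,
         "--filter-name", "hard_filter", "-O", flagged],
        # Step 3: drop the records that were marked as filtered.
        ["gatk", "SelectVariants", "-R", reference, "-V", flagged,
         "--exclude-filtered", "-O", passed],
    ]


if __name__ == "__main__":
    for command in build_training_set_commands(
            "UMD3.1_chromosomes.fa", "1000bullgenome.vcf", "training"):
        print(shlex.join(command))
```

The script only prints the commands so they can be reviewed before being run. Note that a filter expression on QD, FS, or MQ has an effect only if the known-variants file carries those annotations.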

written 4 weeks ago by yoh1242

Hello yoh1242!

Is this a follow-up question to the above thread, or are you basically asking the same question again?

modified 5 weeks ago • written 5 weeks ago by genomax

Different, but the same question.

I am asking for sites where I can download -resource VCF files for the question mentioned here.

written 4 weeks ago by yoh1242