Question: GATK VariantRecalibrator Aborts With a "NaN LOD value assigned" Error
8.3 years ago by
Chris40 wrote:

Hi all! This is a question about GATK VariantRecalibrator.

My data consist of 50 exome samples sequenced at an average depth of 15X. Everything seems fine at the beginning, but the run ends with this error:

ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --percentBadVariants 0.05, for example) or lowering the maximum number of Gaussians to use in the model (via --maxGaussians 4, for example)

The command I used:

    java -Xmx1555m -jar /home/chris/install/GenomeAnalysisTK-1.6-9-g47df7bb/GenomeAnalysisTK.jar \
        -R /home/chris/data/hg/ucsc.hg19.fasta \
        -T VariantRecalibrator \
        -input /home/chris/data/train/ \
        -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /home/chris/data/train/hapmap_3.3.hg19.sites.vcf \
        -resource:omni,known=false,training=true,true=false,prior=12.0 /home/chris/data/train/1000G_omni2.5.hg19.sites.vcf \
        -resource:dbsnp,known=true,training=false,truth=false,prior=8.0 /home/chris/data/hg/dbsnp_135.hg19.vcf \
        -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an InbreedingCoeff -an FS -an DP -an MQ -an InbreedingCoeff \
        -recalFile /home/chris/SRR_50bam.recal \
        -tranchesFile /home/chris/SRR_50bam.tranches \
        -rscriptFile /home/chris/plots.R \
        -nt 2 -mG 4 -percentBad 0.05 \
        -L /home/chris/data/train/exome.bed

Note that I already pass the -mG 4 and -percentBad 0.05 parameters. An INFO line reported:

INFO 11:49:45,070 VariantDataManager - Additionally training with worst 5.000% of passing data --> 3942 variants with LOD <= 0.0000.

I read somewhere that the cause may be that the negative model, which is trained on the worst X percent of variants, gets too few variants. But when I change -percentBad to 0.15, the error still appears. The INFO line then reads:

INFO 13:10:13,059 VariantDataManager - Additionally training with worst 15.000% of passing data --> 11825 variants with LOD <= 0.0000.

The LOD is still 0. I don't know why the process can't complete. And what is LOD? My raw VCF contains almost 80,000 called SNPs. Are 3942 or 11825 variants really not enough?
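On the "what is LOD?" part: LOD stands for log odds. GATK describes the VQSLOD annotation that variant recalibration writes as the log odds ratio of being a true variant versus being false under the trained Gaussian mixture model. As a rough sketch only (the symbol x for a variant's annotation vector and the model labels are notation introduced here, and any prior term is left out):

```latex
\mathrm{LOD}(x) \approx \log \frac{p(x \mid \text{positive model})}{p(x \mid \text{negative model})}
```

Read this way, the "LOD <= 0.0000" in the INFO lines would be the cutoff below which variants were picked for negative training, not a statement that every LOD came out as 0. That reading is an interpretation of the log line, not something confirmed in this thread.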

Here is a short sample from my raw VCF:

chr1 881627 rs2272757 G A 1249.50 . AC=52;AF=0.650;AN=80;BaseQRankSum=4.484;DB;DP=280;Dels=0.00;FS=0.000;HRun=1;HaplotypeScore=0.4991;InbreedingCoeff=0.1468;MQ=34.16;MQ0=5;MQRankSum=-1.772;QD=5.98;ReadPosRankSum=0.671;SB=-629.98 GT:AD:DP:GQ:PL 0/1:3,5:8:64.89:68,0,65 ./. ........... 0/0:2,0:2:3:0,3,25
chr1 881784 . C T 124.05 . AC=2;AF=0.021;AN=96;BaseQRankSum=0.481;DP=430;Dels=0.00;FS=17.640;HRun=1;HaplotypeScore=0.8485;InbreedingCoeff=-0.0583;MQ=39.06;MQ0=3;MQRankSum=-0.937;QD=4.00;ReadPosRankSum=-0.505;SB=-2.11 GT:AD:DP:GQ:PL 0/0:8,0:8:21.03:0,21,201 ./. ..............0/0:4,0:4:6.01:0,6,61
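One quick check that can be run on records like these is whether each one carries every annotation passed with -an. The Python sketch below is only an illustration: the helper names `parse_info` and `missing_annotations` are made up for this example, and it looks at a single INFO string, not a whole VCF.

```python
# Annotations requested with -an in the VariantRecalibrator command above.
REQUESTED = ["QD", "HaplotypeScore", "MQRankSum", "ReadPosRankSum",
             "InbreedingCoeff", "FS", "DP", "MQ"]


def parse_info(info_field):
    """Turn a VCF INFO string into a dict; flag-style keys (e.g. DB) map to True."""
    parsed = {}
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def missing_annotations(info_field, requested=REQUESTED):
    """List the requested annotation names that are absent from one INFO string."""
    info = parse_info(info_field)
    return [name for name in requested if name not in info]


# INFO column of the second record shown above (chr1 881784).
info_881784 = (
    "AC=2;AF=0.021;AN=96;BaseQRankSum=0.481;DP=430;Dels=0.00;"
    "FS=17.640;HRun=1;HaplotypeScore=0.8485;InbreedingCoeff=-0.0583;"
    "MQ=39.06;MQ0=3;MQRankSum=-0.937;QD=4.00;ReadPosRankSum=-0.505;SB=-2.11"
)

print(missing_annotations(info_881784))  # prints []
```

Both sample records carry all eight requested annotations, so a check like this only becomes informative when run over the whole file.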

I am so sad! I look forward to your reply!

7.3 years ago by
Jorge Amigo12k wrote:

Just so this question is not left unanswered: this has been covered in "Gatk Variantrecalibrator Error Message".
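Separately from that thread, two details in the command posted in the question are easy to miss by eye: the omni resource is tagged true=false where the other resources use truth=false, and -an InbreedingCoeff appears twice. Whether either is related to the NaN LOD error is not established here. A minimal Python sketch for spotting such slips follows; the function `check_command` is hypothetical, and treating the tag keys seen on the hapmap and dbsnp resources as the full set of valid keys is an assumption of the sketch.

```python
import shlex
from collections import Counter

# Tag keys as they appear on the hapmap and dbsnp resources of the command.
# Assumption: this sketch treats them as the complete set of expected keys.
EXPECTED_KEYS = {"known", "training", "truth", "prior"}


def check_command(command):
    """Return (duplicated -an values, {resource name: unexpected tag keys})."""
    tokens = shlex.split(command)
    # Collect the value that follows each -an flag.
    annotations = [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "-an"]
    duplicates = sorted(name for name, n in Counter(annotations).items() if n > 1)
    odd_keys = {}
    for tok in tokens:
        if not tok.startswith("-resource:"):
            continue
        # "-resource:omni,known=false,..." -> name "omni" plus key=value tags.
        name, *tags = tok[len("-resource:"):].split(",")
        unexpected = sorted({t.split("=", 1)[0] for t in tags} - EXPECTED_KEYS)
        if unexpected:
            odd_keys[name] = unexpected
    return duplicates, odd_keys


# Abbreviated excerpt of the original command; directories and other flags are left out.
excerpt = (
    "-resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.hg19.sites.vcf "
    "-resource:omni,known=false,training=true,true=false,prior=12.0 1000G_omni2.5.hg19.sites.vcf "
    "-resource:dbsnp,known=true,training=false,truth=false,prior=8.0 dbsnp_135.hg19.vcf "
    "-an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an InbreedingCoeff "
    "-an FS -an DP -an MQ -an InbreedingCoeff"
)

duplicates, odd_keys = check_command(excerpt)
print(duplicates)  # prints ['InbreedingCoeff']
print(odd_keys)    # prints {'omni': ['true']}
```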
