I am using the VariantRecalibrator step to build a Gaussian mixture model from the annotations of a high-quality subset of the input call set and then evaluate the input variants against it. The command ran without any error message and produced these output files: T_S7998.snps.VarRecal.tranches, T_S7998.snps.VarRecal.plot.R, T_S7998.snps.VarRecal.tranches.pdf, T_S7998.snps.VarRecal.recal.idx, T_S7998.snps.VarRecal.recal, and T_S7998.GATKsnps.VarRecal.log. The tranches plot was created as a PDF, but the plot that T_S7998.snps.VarRecal.plot.R should generate, which shows the true positive SNPs of my input call set against the training data sets in the Gaussian mixture model, was not created. The T_S7998.snps.VarRecal.plot.R file sets the output path with outputPDF <- "/scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.plot.R.pdf", but that file does not exist either. I suspect that the ggplot2 library the script needs is not installed in the R module on our cluster. I am running this on a shared server, so I cannot install the library myself. Can anyone tell me whether the missing plot is caused by ggplot2 not being installed or by something else? The command I used is below.
java -Xmx14g -jar /data/PGP/gmelloni/GenomeAnalysisTK-2.3-4-g57ea19f/GenomeAnalysisTK.jar \
    -T VariantRecalibrator \
    -R /scratch/GT/vdas/test_exome/exome/hg19.fa \
    -input /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.GATKsnps.raw.vcf \
    -resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 /scratch/GT/vdas/test_exome/exome/databases/hapmap_3.3.hg19.vcf \
    -resource:omni,VCF,known=false,training=true,truth=false,prior=12.0 /scratch/GT/vdas/test_exome/exome/databases/1000G_omni2.5.hg19.vcf \
    -resource:dbsnp,VCF,known=true,training=false,truth=false,prior=8.0 /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf \
    -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ \
    --maxGaussians 4 \
    -mode SNP \
    -log /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.GATKsnps.VarRecal.log \
    -recalFile /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.recal \
    -tranchesFile /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.tranches \
    -rscriptFile /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.plot.R \
    --percentBadVariants 0.05
The command looks correct to me, as do the resource training sets I used to model my input call set and identify the true positives. I would greatly appreciate any suggestions for getting the plot generated.
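To separate the "ggplot2 is missing" explanation from other causes, a check along the following lines could be run on the same node and with the same R module loaded as the GATK job. This is only a minimal sketch: the helper name check_r_package is hypothetical, and it assumes that the Rscript found on the PATH is the same one GATK would call.

```shell
# Hypothetical helper: report whether an R package can be loaded
# by the Rscript executable found on the PATH.
# Prints "installed", "missing", or "no-rscript".
check_r_package() {
    pkg="$1"
    # Without Rscript on the PATH the generated plot script cannot run at all.
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "no-rscript"
        return 2
    fi
    # library() raises an error, and Rscript exits non-zero,
    # when the package is not installed.
    if Rscript -e "suppressPackageStartupMessages(library($pkg))" >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
        return 1
    fi
}

result=$(check_r_package ggplot2) || true
echo "ggplot2 status: $result"
```

If this reports "installed", running the generated script by hand with Rscript followed by the path to T_S7998.snps.VarRecal.plot.R should print the actual R error, which would point to a cause other than ggplot2.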
Thanks a lot.