Question

R Plot Did Not Get Generated Using The Variantrecalibrator

11.6 years ago

ivivek_ngs ★ 5.2k

Dear All,

I am running the VariantRecalibrator step, which builds a Gaussian mixture model from the annotations of a high-quality subset of the input call set and then evaluates the input variants against it. The command ran without any error message and produced these output files: T_S7998.snps.VarRecal.tranches, T_S7998.snps.VarRecal.plot.R, T_S7998.snps.VarRecal.tranches.pdf, T_S7998.snps.VarRecal.recal.idx, T_S7998.snps.VarRecal.recal and T_S7998.GATKsnps.VarRecal.log. The tranches plot (.pdf) was created, but the plot that T_S7998.snps.VarRecal.plot.R should produce, which shows the true-positive SNPs of my input call set against the training data sets in the Gaussian mixture model, was not. The T_S7998.snps.VarRecal.plot.R file contains the path of the output PDF, outputPDF <- "/scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.plot.R.pdf", but that file does not exist either.

I suspect that the ggplot2 library, which the script needs, is not installed in the R module on our cluster. I am running this on the server, so I cannot install the library there. Can anyone tell me whether the missing library is the cause, or whether it is something else? The command I used is below.

java -Xmx14g -jar /data/PGP/gmelloni/GenomeAnalysisTK-2.3-4-g57ea19f/GenomeAnalysisTK.jar -T VariantRecalibrator -R /scratch/GT/vdas/test_exome/exome/hg19.fa -input /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.GATKsnps.raw.vcf -resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 /scratch/GT/vdas/test_exome/exome/databases/hapmap_3.3.hg19.vcf -resource:omni,VCF,known=false,training=true,truth=false,prior=12.0 /scratch/GT/vdas/test_exome/exome/databases/1000G_omni2.5.hg19.vcf -resource:dbsnp,VCF,known=true,training=false,truth=false,prior=8.0 /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ --maxGaussians 4 -mode SNP -log /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.GATKsnps.VarRecal.log -recalFile /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.recal -tranchesFile /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.tranches -rscriptFile /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal.plot.R --percentBadVariants 0.05

The command looks fine to me, and so do the training resources I used to model my input call set and obtain the true positives. I would greatly appreciate any suggestions for getting the plot.

Thanks a lot.

gatk exome-sequencing variant-calling snps • 3.9k views

score 0 · Answer 1 · 2013-11-25

11.6 years ago

Sean Davis 27k

It sounds like you know that this should not work since the necessary software and libraries are not installed. How about installing ggplot2 and any other requirements and trying again? Just FYI, you can likely install ggplot2 into your home directory and things will work fine. If in doubt, just have the sysadmin install it.
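
One way to follow this suggestion without administrator rights is to install ggplot2 into a personal library. The sketch below is an illustration and not part of the original answer: the library location `$HOME/R/library`, the CRAN mirror URL and the function name `install_ggplot2_user` are assumptions, and it presumes the node can reach a CRAN mirror.

```shell
#!/bin/sh
# Assumed location for a personal R library; any writable directory would do.
R_LIBS_USER="${R_LIBS_USER:-$HOME/R/library}"
export R_LIBS_USER

# Hypothetical helper: install ggplot2 into the personal library if R cannot load it yet.
install_ggplot2_user() {
    # R only adds R_LIBS_USER to its search path when the directory exists.
    mkdir -p "$R_LIBS_USER" || return 1
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH; load the cluster's R module first" >&2
        return 1
    fi
    Rscript -e 'if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2", lib = Sys.getenv("R_LIBS_USER"), repos = "https://cloud.r-project.org")'
}

# Call once from a login shell:
#   install_ggplot2_user
```

Any job that later runs R needs the same `R_LIBS_USER` value exported, otherwise R will not search the personal library.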

I just wanted to be sure and to get a second opinion. Unfortunately, after installing the ggplot2 library in my home directory and rerunning the scripts, the plot still was not created. The T_S7998.snps.VarRecal.plot.R file is created, but no T_S7998.snps.VarRecal.plot.R.pdf appears at the path given inside the *.R file, which is outputPDF <- "/scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998_VarRecal/T_S7998.snps.VarRecal_2.plot.R.pdf". I am confused as to why it still fails. The command produces 2 plots: the ordinary one, which was created again the second time, and the one that uses the ggplot2 library, which was not. Do I have to load the R module before running the script? That should not be needed, right? Please let me know if anyone has suggestions.

11.6 years ago by ivivek_ngs ★ 5.2k
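
On the question of whether the R module has to be loaded: a quick check is to look at what the shell that launches GATK can actually see. The sketch below assumes that the plotting step relies on an `Rscript` executable found on `PATH`; the function name `check_r_env` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report which Rscript is visible and whether ggplot2 loads.
check_r_env() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript: not on PATH"
        return 1
    fi
    echo "Rscript: $(command -v Rscript)"
    # Print the library search path and whether ggplot2 can be loaded from it.
    Rscript -e 'print(.libPaths()); cat("ggplot2 available:", requireNamespace("ggplot2", quietly = TRUE), "\n")'
}

check_r_env || echo "Load the R module (or fix PATH) in the same job script that runs GATK"
```

Running this inside the job script, rather than in an interactive login shell, shows the environment the plotting step really gets; a personal library only appears in the printed search path if R has been told about it, for example through `R_LIBS_USER`.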

Or do I have to run T_S7998.snps.VarRecal.plot.R myself, after the VariantRecalibrator step has created it, in order to generate the plots?

11.6 years ago by ivivek_ngs ★ 5.2k

Thanks, I managed to load the library locally from my home directory and then reran the R script, and it is now generating the plots.

11.6 years ago by ivivek_ngs ★ 5.2k
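
The fix reported here, rerunning the generated script by hand once ggplot2 can be loaded, might look like the sketch below. The function name `rerun_plot` and its return codes are made up for illustration; the `outputPDF <- "..."` line it searches for is the one quoted earlier in this thread, and the fallback to `<script>.pdf` is an assumption based on those quoted paths.

```shell
#!/bin/sh
# Hypothetical helper: re-run a generated plotting script and check that its PDF appears.
# Usage: rerun_plot /path/to/T_S7998.snps.VarRecal.plot.R
rerun_plot() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing script: $script" >&2
        return 2
    fi
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH" >&2
        return 3
    fi
    # Read the target path from the script's own outputPDF line, if present.
    pdf=$(sed -n 's/^outputPDF <- "\(.*\)".*$/\1/p' "$script" | head -n 1)
    [ -n "$pdf" ] || pdf="${script}.pdf"
    Rscript "$script" || return 4
    if [ -s "$pdf" ]; then
        echo "plot written: $pdf"
    else
        echo "no PDF at $pdf; check the outputPDF line in the script" >&2
        return 1
    fi
}
```

If ggplot2 sits in a personal library, `R_LIBS_USER` has to be exported before calling `rerun_plot`, so that the script's `library(ggplot2)` call can find it.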