Question: Difficulty installing GATK toolkit
jaqx008 wrote:

Hey guys. I am trying to perform variant calling on a whole genome. The command requires that I install some applications. I was able to install all of them except the GATK toolkit. I downloaded the latest version but am unable to install it. I was told to point the directory to the folder containing GATK, but this did not work. Does anybody have an idea how I can successfully install this tool?

written 8 months ago by jaqx008

Link to the page with the directions for installing GATK that you are following. Also list the specific commands you used and any errors that you got.

written 8 months ago by genomax

The link to the page for installing GATK: https://gatkforums.broadinstitute.org/gatk/discussion/2899/howto-install-all-software-packages-required-to-follow-the-gatk-best-practices

The command I ran was

$ java -jar $/Users/flynt/Downloads/gatk-4.0.0.0/FATK.jarfiles -T HaplotypeCAller -R /Volumes/500GB/pavelAssemblyBowtieIndex/DfarGenome.fa -I /Volumes/500GB/pavelAssemblyBowtieIndex/dustMapping.sorted.bam -o SNPs.vcf
Error: Unable to access jarfile $/Users/flynt/Downloads/gatk-4.0.0.0/FATK.jarfiles
modified 8 months ago by genomax • written 8 months ago by jaqx008

You appear to have an extra $ before the jar file path. I am assuming the rest of the file names are correct. Try the following. This version of GATK requires Java 1.8; hopefully that is what you have.

$ java -jar /Users/flynt/Downloads/gatk-4.0.0.0/FATK.jarfiles -T HaplotypeCAller -R /Volumes/500GB/pavelAssemblyBowtieIndex/DfarGenome.fa -I /Volumes/500GB/pavelAssemblyBowtieIndex/dustMapping.sorted.bam -o SNPs.vcf
modified 8 months ago • written 8 months ago by genomax

The extra $ is the one preceding the current directory. But ok, let me try.

written 8 months ago by jaqx008

I got this error. Error: Invalid or corrupt jarfile /Users/flynt/Downloads/gatk-4.0.0.0/FATK.jarfiles

written 8 months ago by jaqx008
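The two java errors so far point at different problems: "Unable to access jarfile" means java could not read anything at that path, while "Invalid or corrupt jarfile" means a file is there but it is not a valid archive. A rough way to tell them apart before calling java is sketched below. `diagnose_jar` is a made-up helper name, and the "PK" check relies on the assumption that a jar is a zip archive, which normally starts with those two bytes.

```shell
#!/usr/bin/env bash
# Sketch: classify a path along the lines of the two java errors in this thread.
# diagnose_jar is a hypothetical helper name, not part of GATK or Java.
diagnose_jar() {
  local path="$1"
  if [ ! -r "$path" ]; then
    # Corresponds to "Unable to access jarfile": missing file, typo, stray "$".
    echo "unreadable"
    return 1
  fi
  # A jar is a zip archive, and zip archives normally start with the bytes "PK".
  if [ "$(head -c 2 "$path")" != "PK" ]; then
    # Corresponds to "Invalid or corrupt jarfile": file exists but is not a zip.
    echo "not-a-zip"
    return 2
  fi
  echo "looks-like-a-jar"
}

# An unquoted "$" directly before "/" is kept literally by bash, so the
# argument stays $/Users/flynt/Downloads and does not name a real path.
echo $/Users/flynt/Downloads
```

Running `diagnose_jar /Users/flynt/Downloads/gatk-4.0.0.0/FATK.jarfiles` would show which of the two cases applies; `ls` on the gatk-4.0.0.0 folder is still the quickest way to see what the file is really called.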

I am not sure what you have downloaded, since I can't see these files in the GATK v.4.0 that I just downloaded and unzipped. I just have an executable called gatk in the gatk-4.0.0.0 folder, and if I run it I get this.

./gatk

 Usage template for all tools (uses --spark-runner LOCAL when used with a Spark tool)
    gatk AnyTool toolArgs
modified 8 months ago • written 8 months ago by genomax

Thanks. Looks good, but the error shifted to the Haplotype part. Error: Unable to access jarfile HaplotypeCAller

written 8 months ago by jaqx008

Commands for GATK v.4.0 appear to be different.

You may want to consider going back to GATK v.3.x, if you continue to have issues. GATK v.4.0 is brand new and the documentation may not be current/available in all cases.

modified 8 months ago • written 8 months ago by genomax
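For comparison, here is a sketch of how the same HaplotypeCaller call differs between the two versions. The v4 form follows the "gatk AnyTool toolArgs" usage template shown earlier in the thread; the v3 form assumes the single GenomeAnalysisTK.jar that ships with GATK 3.x. `java -jar` treats the very next argument as the jar path, which would explain an "Unable to access jarfile HaplotypeCAller" error if the tool name ends up in that position. The script only prints the commands (a dry run), and the file names are the ones from this thread.

```shell
#!/usr/bin/env bash
# Dry run: print the two command forms instead of executing them.
# File names are taken from the thread; adjust them to your own data.
ref="DfarGenome.fa"
bam="dustMapping.sorted.bam"
out="SNPs.vcf"

gatk3_cmd() {
  # GATK 3.x: one jar, tool selected with -T, output given with lowercase -o.
  echo "java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R $ref -I $bam -o $out"
}

gatk4_cmd() {
  # GATK 4.x: the gatk wrapper takes the tool name as its first argument
  # (no -jar, no -T); output is given with uppercase -O.
  echo "./gatk HaplotypeCaller -R $ref -I $bam -O $out"
}

gatk3_cmd
gatk4_cmd
```

Note that the tool name is case sensitive, so it needs to be spelled HaplotypeCaller in either version.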

You are right. Let me try that and see what happens.

written 8 months ago by jaqx008

I got the 3.3 version and it worked with the example files that come with it. Thanks again for your suggestions. My command is now

java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R DfarGenome.fa -I dustMapping.sorted.bam -o Dustmapping.vcf

I have ensured that all the supporting files (.bai, .fai, .dict) are available in the folder, yet I get the following error:

##### ERROR MESSAGE: Fasta dict file /Volumes/500GB/pavelAssemblyBowtieIndex/DfarGenome.dict for reference /Volumes/500GB/pavelAssemblyBowtieIndex/DfarGenome.fa does not exist. Please see http://gatkforums.broadinstitute.org/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference for help creating it.
##### ERROR
modified 8 months ago by genomax • written 8 months ago by jaqx008

Perhaps you need to amend your $PATH. export PATH=/Volumes/500GB/pavelAssemblyBowtieIndex/:$PATH

modified 8 months ago • written 8 months ago by genomax

It's still generating the same error. Frustrating... :(

written 8 months ago by jaqx008

You may need to recreate the .dict files with v.3.x if you created them with v.4.0 previously. Those two versions are likely incompatible.

modified 8 months ago • written 8 months ago by genomax

Are you talking about the picard version?

written 8 months ago by jaqx008

Sorry. I forgot that .dict files are made using Picard. Not sure why you are getting that error if all three files are in the same directory. Try re-creating the dict file just in case.

modified 8 months ago • written 8 months ago by genomax

Using a case. I am not sure what this means.

written 8 months ago by jaqx008

If by some chance either .dict and/or .fai files are corrupt (which may be causing the error message) you could try to re-create both and see if that helps.

written 8 months ago by genomax

I just did. The error is still on the .dict. Or could it be that my reference genome is a .fa? This should be the same as .fasta, I guess.

written 8 months ago by jaqx008
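On the .fa versus .fasta question: going by the error message earlier in the thread, GATK 3 looks for DfarGenome.dict next to DfarGenome.fa, i.e. the reference name with its extension swapped for .dict. The .fai index made by samtools faidx instead keeps the full name and appends .fai. A small check along those lines is sketched below; the helper names are made up for illustration, picard.jar is a placeholder path, and the extension handling assumes the reference file name has exactly one extension to strip.

```shell
#!/usr/bin/env bash
# expected_dict, expected_fai and check_reference are hypothetical helper names.
expected_dict() {
  # Replace the final extension (.fa, .fasta, ...) with .dict.
  echo "${1%.*}.dict"
}

expected_fai() {
  # The samtools faidx index keeps the full file name and appends .fai.
  echo "$1.fai"
}

check_reference() {
  # Report which of the three files exist; return non-zero if any is missing.
  local ref="$1" status=0 f
  for f in "$ref" "$(expected_dict "$ref")" "$(expected_fai "$ref")"; do
    if [ -e "$f" ]; then
      echo "ok       $f"
    else
      echo "MISSING  $f"
      status=1
    fi
  done
  return "$status"
}

# Dry run: commands commonly used to (re)create the two files.
# picard.jar is a placeholder; point it at your own Picard download.
echo "java -jar picard.jar CreateSequenceDictionary R=DfarGenome.fa O=$(expected_dict DfarGenome.fa)"
echo "samtools faidx DfarGenome.fa"
```

Calling `check_reference /Volumes/500GB/pavelAssemblyBowtieIndex/DfarGenome.fa` would list exactly which file names are expected and which are absent, so a mismatched .dict name shows up straight away.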

Hello Genomax. I realized my dict file name was not the same as the fasta file name. I corrected that and the error was resolved. I now have a new error which says my BAM lacks a read group, and I found a command to add the read groups. I am currently running the command. Thought I should keep you posted. Thanks again.

written 8 months ago by jaqx008
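For anyone who lands here with the same read group error: the thread does not say which command was used, but one common route is Picard's AddOrReplaceReadGroups. The sketch below checks a SAM header for an @RG line and prints a Picard call as a dry run. `has_read_group` is a made-up helper, and the picard.jar path, the output file name, and all RG* values are placeholders, not values from the original poster.

```shell
#!/usr/bin/env bash
# has_read_group is a hypothetical helper: it reads SAM header text on stdin
# and succeeds if at least one @RG line is present.
has_read_group() {
  grep -q '^@RG'
}

# With samtools installed, the header could be checked like this:
#   samtools view -H dustMapping.sorted.bam | has_read_group && echo "has @RG"

# Dry run of a Picard call that adds a read group. The RG* values, the output
# name and the picard.jar path are placeholders; pick values for your sample.
echo "java -jar picard.jar AddOrReplaceReadGroups" \
     "I=dustMapping.sorted.bam O=dustMapping.rg.bam" \
     "RGID=1 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=dust"
```

The new BAM would then need its own index (for example with samtools index) before it is passed to HaplotypeCaller.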