Good afternoon, I'm working with GATK, specifically the HaplotypeCaller tool, to create GVCF files, and I've seen that it takes quite a lot of time. I'm in a bit of a rush and would like to speed up the process, but I have not found any useful information about it. Here is my command:
java -jar ~/softwares/GATKK/gatk/gatk-package-4.1.7.0-local.jar HaplotypeCaller --reference Pmuralis_1.0.fa --input run2_mergeandaligned.bam --output run2_4096_mergeandaligned.g.vcf -ERC GVCF
The only thing that seems to speed it up a little is adding -Xmx4096m at the beginning, like this:
java -Xmx4096m -jar ~/softwares/GATKK/gatk/gatk-package-4.1.7.0-local.jar HaplotypeCaller --reference Pmuralis_1.0.fa --input run2_mergeandaligned.bam --output run2_4096_mergeandaligned.g.vcf -ERC GVCF
Another thing I noticed is this message:
20:58:54.759 INFO IntelPairHmm - Available threads: 20
20:58:54.759 INFO IntelPairHmm - Requested threads: 4
My server has 20 cores, but the process is only using 4. I thought I had solved it by adding -native-pair-hmm-threads 20, but it didn't speed up the process...
Let's see if somebody who knows Java can help me!
Thank you very much!
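For context on the thread count reported above: one commonly used way to occupy more cores is to run several HaplotypeCaller jobs side by side, each restricted to one contig with -L, instead of asking a single job for more threads. The sketch below only prints such commands and runs nothing. It assumes a samtools-style index (Pmuralis_1.0.fa.fai) sits next to the reference; the function name print_scatter_cmds and the output prefix are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: prints one HaplotypeCaller command per contig, executes nothing.
# Assumption: "<reference>.fai" exists and its first tab-separated column
# holds the contig name (samtools faidx layout).
# print_scatter_cmds is a hypothetical helper, not part of GATK.
print_scatter_cmds() {
  local jar="$1" ref="$2" bam="$3" prefix="$4"
  local contig rest
  # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
  while IFS=$'\t' read -r contig rest || [ -n "$contig" ]; do
    [ -n "$contig" ] || continue   # skip blank lines
    printf 'java -Xmx4096m -jar %s HaplotypeCaller --reference %s --input %s --output %s.%s.g.vcf -ERC GVCF -L %s\n' \
      "$jar" "$ref" "$bam" "$prefix" "$contig" "$contig"
  done < "${ref}.fai"
}
```

A possible use, assuming GNU parallel is installed: write the commands to a file with print_scatter_cmds ~/softwares/GATKK/gatk/gatk-package-4.1.7.0-local.jar Pmuralis_1.0.fa run2_mergeandaligned.bam run2 > cmds.txt, then run parallel -j 5 < cmds.txt. Five jobs at the 4 threads each one requests by default would roughly fill 20 cores, provided memory allows five 4 GB Java heaps. The per-contig GVCFs still have to be merged into a single file afterwards, and a reference with very many small scaffolds would be better split into a few groups of contigs than into one job per contig.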
You asked several questions on this forum without validating the answers: "Coverage per gene", "StrandBiasBySample error Haplotypecaller", etc. Please validate the correct answers (green mark on the left).
Sorry Pierre, I'm new here and didn't know about that. I've just done it, and thanks again for your earlier help! :)