Question: Improve the speed of HaplotypeCaller
gubrins wrote, 8 months ago:

Good afternoon, I'm working with GATK, specifically the HaplotypeCaller tool, to create GVCF files, and I've seen that it takes quite a lot of time. I'm a bit in a rush and I would like to speed up the process, but I did not find useful information about it. Here is my command:

java -jar ~/softwares/GATKK/gatk/gatk-package-4.1.7.0-local.jar HaplotypeCaller --reference Pmuralis_1.0.fa --input run2_mergeandaligned.bam --output run2_4096_mergeandaligned.g.vcf -ERC GVCF

The only thing that seems to speed up the process a little is adding -Xmx4096m at the beginning, like this:

java -Xmx4096m -jar ~/softwares/GATKK/gatk/gatk-package-4.1.7.0-local.jar HaplotypeCaller --reference Pmuralis_1.0.fa --input run2_mergeandaligned.bam --output run2_4096_mergeandaligned.g.vcf -ERC GVCF

Another thing I noticed is this message:

20:58:54.759 INFO  IntelPairHmm - Available threads: 20
20:58:54.759 INFO  IntelPairHmm - Requested threads: 4

My server has 20 cores, but the process is only using 4. I think I fixed that by adding --native-pair-hmm-threads 20, but it didn't speed up the process... Let's see if somebody who knows Java can help me!

Thank you very much!

Tags: bam, haplotypecaller, gatk, vcf

You have asked several questions on this forum without validating the answers: "Coverage per gene", "StrandBiasBySample error Haplotypecaller", etc. Please validate the correct answers (green mark on the left).

(comment written 8 months ago by Pierre Lindenbaum)

Sorry Pierre, I'm new here and didn't know about that. I've just done it, and thanks again for your earlier help! :)

(reply written 8 months ago by gubrins)
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 8 months ago:

"I'm a bit in a rush and I would like to speed up the process, but I did not find useful information about it."

Split by chromosome using the -L option and run the jobs in parallel.
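A minimal sketch of what splitting by chromosome could look like, built from the command in the question. It only prints one HaplotypeCaller command per contig and does not launch anything itself. The helper name `hc_commands`, the `run2_<contig>.g.vcf` output naming, and the choice of 4 parallel jobs are all illustrative assumptions, not something from the original post. It also assumes the reference has a `.fai` index next to it and that contig names are safe to use in file names.

```shell
# Hypothetical helper: print one HaplotypeCaller command per contig
# named in the first column of a FASTA index (.fai).
# Usage: hc_commands REF.fa.fai REF.fa INPUT.bam GATK.jar
hc_commands() {
  cut -f1 "$1" | while read -r contig; do
    # Same command as in the question, plus -L to restrict to one contig.
    # The per-contig output name (run2_<contig>.g.vcf) is an assumption.
    printf 'java -Xmx4096m -jar %s HaplotypeCaller --reference %s --input %s --output run2_%s.g.vcf -L %s -ERC GVCF\n' \
      "$4" "$2" "$3" "$contig" "$contig"
  done
}

# Possible usage (left commented out; assumes GNU xargs, 4 jobs at a time):
# hc_commands Pmuralis_1.0.fa.fai Pmuralis_1.0.fa run2_mergeandaligned.bam \
#   ~/softwares/GATKK/gatk/gatk-package-4.1.7.0-local.jar > hc_jobs.txt
# xargs -P 4 -I{} sh -c '{}' < hc_jobs.txt
```

Each job holds its own Java heap, so the number of simultaneous jobs should fit within the server's memory as well as its cores. Combining the per-contig GVCFs afterwards is a separate step not shown here.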


I was planning to do that, thanks! My question was more about Java parameters, which I'm not familiar with and couldn't find anything about.

(reply written 8 months ago by gubrins)