Question: CPU/RAM resources for variant calling
4.0 years ago by
Palo Alto, CA, USA
Bogdan950 wrote:

Dear all,

We have just set up a cluster with 4 nodes (128 GB of RAM and 32 CPUs per node). Could you please let me know what the optimal configuration (RAM/CPU) would be for running GATK/MuTect on a node, or any other variant-calling software (such as Strelka, VarScan, or SomaticSniper)? Many thanks,

-- bogdan

snp next-gen • 1.6k views

I can't give you any hard numbers, but I think you will be memory-bound before you are CPU-bound. I have 64 GB of RAM and could run 4 instances of GATK before I maxed out (different components of the pipeline use more or less memory, but 4 seemed to be a good number for essentially every step). However, that is with the default configuration, naively running every step in parallel. There is plenty of room for optimization by changing GATK/Picard parameters and by using more sophisticated pipelining, so that high-memory jobs are run alongside low-memory jobs.
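As a rough illustration of the memory-bound reasoning above, here is a minimal Python sketch that estimates how many jobs fit on one node. The function name and its parameters are hypothetical, and the 16 GB per job figure is only an assumption derived from the "4 instances on 64 GB" observation; actual usage depends on the tool, the step, and the JVM heap settings.

```python
def max_parallel_jobs(node_ram_gb, node_cpus, ram_per_job_gb,
                      cpus_per_job=1, reserved_ram_gb=0):
    """Estimate how many jobs can run at once on a single node.

    The result is whichever limit is hit first: memory or CPU.
    All inputs are rough planning figures, not measured values.
    """
    if ram_per_job_gb <= 0 or cpus_per_job <= 0:
        raise ValueError("per-job RAM and CPU requirements must be positive")

    # RAM left for jobs after setting some aside for the OS and file cache.
    usable_ram_gb = node_ram_gb - reserved_ram_gb

    jobs_by_ram = int(usable_ram_gb // ram_per_job_gb)
    jobs_by_cpu = node_cpus // cpus_per_job

    return max(0, min(jobs_by_ram, jobs_by_cpu))


# Assumed figure: 64 GB / 4 instances = 16 GB per GATK job.
print(max_parallel_jobs(node_ram_gb=128, node_cpus=32, ram_per_job_gb=16))
```

With these assumed numbers, memory caps a 128 GB node well before its 32 CPUs do, which is why lowering per-job memory (or mixing high- and low-memory steps) is where the tuning headroom lies.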

Also, do not neglect the amount of disk space you'll need! Other Biostars users have commented that GATK uses a ton of space for temp files, but this can be overcome by diverging from the Best Practices and piping steps together. Furthermore, recent versions of the HaplotypeCaller include some element of BQSR and indel realignment built in, so those separate steps can perhaps be skipped without much difference to the final SNP calls. I haven't done either of those things myself, but taken together this suggests that you will probably start with a pipeline that runs 8-9 jobs in parallel, and that you will be able to tune things to bring that number up and maximize your resources as you learn more about your data and how these tools work.

written 4.0 years ago by John12k

Thanks a lot, John, for sharing your experience with GATK!

written 4.0 years ago by Bogdan950