Question: Computational resources for WGS variant calling
alesssia (London, UK) wrote, 9 months ago:

Dear all,

We have WGS data for about 2000 individuals (30x, ~100 GB per file). We would like to align them using bwakit and then perform variant calling with GATK HaplotypeCaller, something I have never done before at this scale (or with such large files).

We have limited computational resources, so we will be applying for time on an external OpenStack cluster (something I am not familiar with). I need to prepare a list of computational requirements for the application, and I would like to gather some suggestions from someone with more experience than me.

In your opinion, how much memory would I need for each sample? And how long will it take?

I have been told that each node in the OpenStack cluster has 48 cores and 512 GB of RAM (so it could be split into 24 cores with 256 GB, 12 cores with 128 GB, etc.), with a local disk of 50 GB and storage mounted via NFS.

Thank you very much in advance, any suggestion will be highly appreciated!

Jeremy Leipzig (Philadelphia, PA) wrote, 8 months ago:

You have enough memory (GATK 3.x might take 64 GB for some steps). In theory, you might finish this in 12 days or so.
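For the requirements list, a back-of-the-envelope calculation along these lines may help. This is only a sketch: the sample count, node size, and the 64 GB memory figure come from this thread, but `CORE_HOURS_PER_SAMPLE` and `CORES_PER_JOB` are placeholder assumptions that should be replaced with timings from a pilot run on a few samples, and the arithmetic assumes perfect scaling with no queueing or I/O stalls.

```python
import math

# Figures taken from the thread
N_SAMPLES = 2000
CORES_PER_NODE = 48
RAM_PER_NODE_GB = 512
RAM_PER_JOB_GB = 64        # upper figure quoted for some GATK 3.x steps
TARGET_DAYS = 12           # the "in theory" figure above

# Assumptions (NOT from the thread): measure these in a pilot run
CORE_HOURS_PER_SAMPLE = 400   # hypothetical cost of alignment + calling per 30x genome
CORES_PER_JOB = 8             # hypothetical threads given to each sample


def concurrent_jobs_per_node(cores_per_node, ram_per_node_gb,
                             cores_per_job, ram_per_job_gb):
    """How many samples fit on one node, limited by cores or by RAM."""
    return min(cores_per_node // cores_per_job,
               ram_per_node_gb // ram_per_job_gb)


def nodes_needed(n_samples, core_hours_per_sample, cores_per_node, target_days):
    """Nodes required to finish within target_days, assuming ideal scaling."""
    total_core_hours = n_samples * core_hours_per_sample
    core_hours_per_node = cores_per_node * 24 * target_days
    return math.ceil(total_core_hours / core_hours_per_node)


if __name__ == "__main__":
    jobs = concurrent_jobs_per_node(CORES_PER_NODE, RAM_PER_NODE_GB,
                                    CORES_PER_JOB, RAM_PER_JOB_GB)
    nodes = nodes_needed(N_SAMPLES, CORE_HOURS_PER_SAMPLE,
                         CORES_PER_NODE, TARGET_DAYS)
    print(f"Concurrent samples per node: {jobs}")
    print(f"Nodes needed for {TARGET_DAYS} days: {nodes}")
```

With 8 cores and 64 GB per job, cores are the limiting factor rather than memory, which is consistent with the point that memory should not be the bottleneck on these nodes.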

The local scratch space might be an issue; maybe 50 GB is a typo? My phone has more than that.
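To make the scratch concern concrete, here is a minimal sketch of the disk arithmetic. The per-sample input size, sample count, and local disk size are the figures from the question; `WORKING_SET_FACTOR` is a hypothetical multiplier for intermediate files (sorted BAM, temporary files, gVCF) and is purely an assumption to be replaced with a measured value.

```python
# Figures taken from the question
INPUT_GB_PER_SAMPLE = 100
N_SAMPLES = 2000
LOCAL_DISK_GB = 50

# Assumption (NOT from the thread): scratch used per sample relative to its input size
WORKING_SET_FACTOR = 2.0


def scratch_needed_gb(input_gb, working_set_factor, concurrent_jobs=1):
    """Scratch space required if concurrent_jobs samples run on a node at once."""
    return input_gb * working_set_factor * concurrent_jobs


def fits_on_local_disk(input_gb, working_set_factor, local_disk_gb, concurrent_jobs=1):
    """True if the working set for the running jobs fits on the node's local disk."""
    return scratch_needed_gb(input_gb, working_set_factor, concurrent_jobs) <= local_disk_gb


# Total raw input across the cohort, in TB (decimal units)
total_input_tb = N_SAMPLES * INPUT_GB_PER_SAMPLE / 1000

if __name__ == "__main__":
    need = scratch_needed_gb(INPUT_GB_PER_SAMPLE, WORKING_SET_FACTOR)
    ok = fits_on_local_disk(INPUT_GB_PER_SAMPLE, WORKING_SET_FACTOR, LOCAL_DISK_GB)
    print(f"Scratch per sample: {need:.0f} GB, fits on local disk: {ok}")
    print(f"Total raw input: {total_input_tb:.0f} TB")
```

Even with a factor of 1.0, a single ~100 GB input does not fit on a 50 GB disk, so all intermediate I/O would have to go over NFS unless the nodes really have more local storage.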

Perhaps more importantly, you should bring an experienced bioinformatics engineer on board to design this pipeline and to handle the sequencing and technical metadata properly. Otherwise, debugging the pipeline and handling subsequent runs will take far more time than a computer could ever waste.
