Question

NGS compute infrastructure

0

6.7 years ago

Cosmo ▴ 10

Hello,

I am a new PI in computational genomics looking to set up compute infrastructure for my lab. The lab will perform two main computational tasks: 1. simulations (hundreds to thousands of jobs, each running from a few seconds up to a few minutes) and 2. genome alignments, SNP calling, etc. (only a few jobs, but with higher RAM requirements). I am therefore looking into two options: one system with a large amount of RAM but few CPUs, and one with many CPUs but less RAM, or alternatively a solution where RAM can be temporarily shared (ideally with a RAID5 or RAID6). I would greatly appreciate it if someone could share their experience with different compute architectures (as well as which companies they can recommend).

Thanks!

next-gen sequencing alignment • 1.4k views

0

Make sure you have sufficient I/O capacity to make full use of the CPU and RAM. Even the best cluster is pointless if an I/O bottleneck kills performance and prevents you from using multithreading effectively.
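One rough way to check whether storage is likely to be the bottleneck is to time a sequential write on the file system the jobs will use. The sketch below is only an illustration: the function name `sequential_write_mb_per_s` and the default sizes are made up for this example, and a single small write test says nothing about read speed or many parallel jobs hitting the same disk.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, size_mb=64, block_mb=4):
    """Write about size_mb of random data into `directory`; return MB/s.

    Hypothetical helper for illustration only, not a real benchmark tool.
    """
    block = os.urandom(block_mb * 1024 * 1024)
    n_blocks = max(1, size_mb // block_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(n_blocks):
                handle.write(block)
            handle.flush()
            # Ask the OS to push the data to the device, not just the page cache.
            os.fsync(handle.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temporary file
    return (n_blocks * block_mb) / elapsed


if __name__ == "__main__":
    rate = sequential_write_mb_per_s(tempfile.gettempdir())
    print(f"sequential write: {rate:.0f} MB/s")
```

Pointing `directory` at the scratch or project file system (rather than the temp directory used here) gives a number closer to what alignment jobs would actually see.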

6.7 years ago by ATpoint 82k

0

I have no experience with this, but if you have a hard time estimating your needs, you could also look at more flexible cloud-based solutions, where you pay only for what you use, when you need it. Perhaps others have a different opinion about this.

6.7 years ago by WouterDeCoster 47k

0

From your post it seems that you may be conflating RAM with storage space.

RAM cannot be shared via a RAID. The latter stands for "redundant array of independent disks", so RAIDs are hard drive storage systems, not computer memory.

RAM - the computer memory that programs use while they run; it typically ranges from tens to hundreds of GB.
RAID - a hard drive storage system; it determines how much data can be stored in general and typically starts at many terabytes.

As genomax states, get as much RAM as possible, hundreds of GB if you can.
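To make the RAM versus storage distinction concrete, here is a small sketch that reports the two separately on an existing machine. It assumes a Linux-like system where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`; the function names are chosen just for this example.

```python
import os
import shutil


def total_ram_gb():
    """Physical memory in GB (assumes POSIX sysconf names, e.g. on Linux)."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    n_pages = os.sysconf("SC_PHYS_PAGES")    # number of physical pages
    return page_size * n_pages / 1024**3


def disk_capacity_tb(path="/"):
    """Total capacity in TB of the file system that contains `path`."""
    return shutil.disk_usage(path).total / 1024**4


if __name__ == "__main__":
    print(f"RAM:  {total_ram_gb():.1f} GB")
    print(f"Disk: {disk_capacity_tb('/'):.2f} TB")
```

The two numbers come from entirely different hardware: a RAID changes only the second one.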

6.7 years ago by Istvan Albert 100k

score 2 · Answer 1 · 2017-08-19

Real RAM, when needed, has no functional replacement. If you don't have enough of it, you simply will not be able to run certain jobs. So no matter what you choose, make sure you get at least as much RAM as your largest jobs will need (+ 10% to account for future needs), and then plan the rest of the hardware.
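The sizing rule above (largest job plus 10%) can be sketched as a short calculation. The job names and memory figures below are placeholders invented for illustration, not measurements; substitute the peak memory you actually observe for your own tools.

```python
def minimum_ram_gb(peak_ram_by_job_gb, headroom=0.10):
    """Return the peak RAM of the largest job plus a headroom fraction.

    `peak_ram_by_job_gb` maps a job name to its peak memory use in GB.
    """
    if not peak_ram_by_job_gb:
        raise ValueError("need at least one job to size the machine")
    largest = max(peak_ram_by_job_gb.values())
    return largest * (1 + headroom)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    jobs = {"simulation": 2, "snp_calling": 32, "genome_alignment": 64}
    print(f"buy at least {minimum_ram_gb(jobs):.1f} GB of RAM")
```

Note that only the largest single job drives the minimum; if several large jobs must run at the same time on one machine, their peaks would need to be summed first.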