9.2 years ago by
Washington University in St. Louis, MO
Okay, well then I'll go ahead and throw some info out there in the hopes that it's useful to you.
What I can tell you is that the cluster we share time on has 8-core machines with 16GB of RAM each, and they're sufficient for most of our needs. We don't do much assembly, but we do a ton of other genomic processing, ranging from mapping short reads all the way up to SNP calling and pathway inference. I also still do a fair amount of array processing.
With most cluster management tools (PBS, LSF, whatever), it should be possible to let a user reserve more than one CPU per node, effectively giving them up to 16GB for a process if they reserve the whole node. Yeah, that means some lost cycles, but I don't seem to need it that often - 2GB is still sufficient for most things I run. It'd also be good to set up a handful of machines with a whole lot of RAM - maybe 64GB? That gives users who are doing things like assembly or loading huge networks into RAM some options.
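To put rough numbers on that, here's a minimal sketch of the arithmetic. It assumes memory is split evenly across cores (16GB / 8 cores = 2GB each on the nodes I described), which is a simplification; real schedulers have their own ways of requesting memory. The function name and defaults are made up for illustration.

```python
import math


def cores_to_reserve(mem_needed_gb, node_mem_gb=16, node_cores=8):
    """Estimate how many cores to reserve on one node to cover a memory need.

    Assumes (hypothetically) that each reserved core entitles the job to an
    equal share of the node's RAM.
    """
    if mem_needed_gb > node_mem_gb:
        raise ValueError("job does not fit on a single node of this size")
    mem_per_core = node_mem_gb / node_cores  # 2 GB on a 16GB, 8-core node
    # Always reserve at least one core, and round up to whole cores.
    return max(1, math.ceil(mem_needed_gb / mem_per_core))


if __name__ == "__main__":
    for need in (2, 5, 16):
        print(need, "GB ->", cores_to_reserve(need), "core(s)")
```

A 5GB job would tie up 3 cores under this assumption, so two of those cores' worth of cycles may sit idle unless the job is multithreaded. That's the "lost cycles" trade-off.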
I more often run into limits on I/O. Giving each machine a reasonably sized scratch disk and encouraging your users to make smart use of it is a good idea. Network filesystems can get bogged down really quickly when a few dozen nodes are all reading and writing data. If you're going to be doing lots of really I/O-intensive stuff (and dealing with short reads, you probably will be), it's probably worth looking into faster hard drives. Certainly 7200RPM, if not 10k. Last time I looked, 15k drives were available, but not worth it in terms of price/performance. That may have changed.
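By "smart use" of scratch I mean something like the pattern below: copy the input from the network filesystem to local disk once, do all the heavy reading and writing locally, then copy the result back once. This is just a sketch of the idea, not anything specific to our cluster; `run_on_scratch`, `process`, and `scratch_root` are hypothetical names, and your site's scratch path will differ.

```python
import shutil
import tempfile
from pathlib import Path


def run_on_scratch(input_path, output_dir, process, scratch_root=None):
    """Stage input to local scratch, run process(local_in, local_out), copy result back.

    scratch_root is the node-local scratch directory (site-specific assumption);
    None falls back to the system default temp directory.
    """
    input_path = Path(input_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    with tempfile.TemporaryDirectory(dir=scratch_root) as tmp:
        tmp = Path(tmp)
        local_in = tmp / input_path.name
        local_out = tmp / (input_path.name + ".out")

        shutil.copy2(input_path, local_in)  # single read from the network filesystem
        process(local_in, local_out)        # all intermediate I/O stays on local disk
        final = output_dir / local_out.name
        shutil.copy2(local_out, final)      # single write back to shared storage
    # The scratch directory is removed automatically when the block exits.
    return final
```

Here `process` would be whatever wraps your mapper or other tool. The point is that the shared filesystem sees two big sequential transfers per job instead of a stream of small reads and writes from every node.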
I won't get into super-detail on the specs - you'll have to price that out and see where the sweet spot is. I also won't tell you how many nodes to get, because again, that depends on your funding. I will say that if you're talking about a small cluster for a small lab, it may make sense to just get 3 or 4 machines with 32 cores and a bunch of RAM, and not worry about trying to set up a shared filesystem, queue, etc. - it really can be a headache to maintain. If you'll be supporting a larger userbase, though, then you may find a better price point with fewer CPUs per node, and potentially have fewer problems with disk I/O (because you'll have fewer CPUs per hard drive).
People who know more about cluster maintenance and hardware than I do, feel free to chime in with additions or corrections.