Peer-Reviewed Justification For Big Ass Servers.
11.9 years ago
Yannick Wurm ★ 2.5k

Hi all,

I'm trying to help make a case to the admin of my uni to "unlock" funds for a Big Ass Server purchase. Part of the challenge is explaining that existing supercomputer facilities (thousands of low-mem nodes) are inadequate.

Jeremy Leipzig makes some excellent points. However, peer-reviewed publications are stronger evidence than blog posts. Do you know of any publications that support this idea?

A first one that indirectly supports the point is CAGE with the statement that for _de novo_ assembly most software "crashed, often after several days running on a 256-GB multi-core computer."

Tags: server • memory

I would be happy to contribute to any manuscript that has the phrase "big ass" in it.


Hehe, here in the UK I've now repeatedly heard the variant "fuck-off big machines". For appropriate searchability both descriptions would have to be used.

10.8 years ago
Yannick Wurm ★ 2.5k

Microsoft helped us with this one:

Nobody ever got fired for buying a cluster - Raja Appuswamy, Christos Gkantsidis, Dushyanth Narayanan, Orion Hodson, and Antony Rowstron - January 2013

Quoting the abstract:

In the last decade we have seen a huge deployment of cheap clusters to run data analytics workloads. The conventional wisdom in industry and academia is that scaling out using a cluster is better for these workloads than scaling up by adding more resources to a single server. Popular analytics infrastructures such as Hadoop are aimed at such a cluster scale-out environment, and in today's world nobody gets fired for adopting a cluster solution.

Is this the right approach? Our measurements as well as other recent work shows that the majority of real-world analytic jobs process less than 100GB of input, but popular infrastructures such as Hadoop/MapReduce were originally designed for petascale processing. We claim that a single "scale-up" server can process each of these jobs and do as well or better than a cluster in terms of performance, cost, power, and server density.


While this TR is interesting, it actually does not have much to do with bioinformatics. It assumes that Hadoop essentially equates to clusters (this point is clearer in the main text than in the abstract) and that "the majority of real-world analytic jobs process less than 100GB of input". I have not read the details, but I guess the TR goes on to argue that we can load all the data into RAM. NGS analysis is very different: except for assembly, embarrassingly parallel approaches work well, and it is rarely necessary to load all the data into memory.

In the real world, clusters dominate in large organizations. I can access 17k+ CPU cores from several institutes/universities. Only a few percent of them are part of big machines (say, with 32 cores and 256GB RAM). That is because for most daily routines we need CPUs more than RAM. If we turned these clusters into big machines on the same budget, I am sure these large organizations would have serious trouble with server load.

In the areas I am familiar with, the only justification for big machines is de novo assembly. Other applications/programs could be written to use huge amounts of RAM, but in most cases that is technically unnecessary. On the other hand, I buy Jeremy's argument that for a small lab running its own machines, the human cost may be lower with big machines: hiring a qualified admin to manage a cluster and training lab members to use it may be more costly in the long run.
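To illustrate the point about embarrassingly parallel NGS routines: many per-read statistics can be computed by streaming a FASTQ file record-by-record in constant memory, and since each file is independent, cheap low-memory cluster nodes handle them fine. A minimal sketch (the function name and filenames are mine, not from any particular tool):

```python
from itertools import islice

def gc_fraction(fastq_lines):
    """Stream FASTQ records (4 lines each) and return the overall GC fraction.

    Memory use is constant regardless of file size: only one record is
    held at a time, so no big-memory node is needed.
    """
    gc = total = 0
    lines = iter(fastq_lines)
    while True:
        record = list(islice(lines, 4))  # header, sequence, '+', qualities
        if len(record) < 4:
            break
        seq = record[1].strip().upper()
        gc += seq.count("G") + seq.count("C")
        total += len(seq)
    return gc / total if total else 0.0

# Usage (hypothetical file): with open("reads.fastq") as fh: print(gc_fraction(fh))
```

Each input file can be dispatched to a separate cluster node with no coordination, which is exactly why most routine NGS jobs do not need a big-memory machine.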


Today's clusters are already being replaced with fewer nodes with more cores. My original post is rapidly becoming irrelevant because everyone is using clusters composed of big-ass servers.


My experience is that while yesterday's "big-ass" servers have become more common, today's definition of a "big-ass" server has shifted to even more CPU and memory. This trend has been hastened a bit by the increasing use of virtualisation and concerns about energy costs (big servers generally cost less to run than the equivalent capacity of small servers).


Thanks for following up on this - very interesting.


The title is a bit too confident: "Nobody ever got fired for buying a cluster", never ever? Really?

10.7 years ago

Two recent software programs that take advantage of Big Ass Server Technology:

Isaac

http://bioinformatics.oxfordjournals.org/content/early/2013/06/04/bioinformatics.btt314.abstract

An ultrafast DNA sequence aligner (Isaac Genome Alignment Software) that takes advantage of high memory hardware (>48GB) and variant caller (Isaac Variant Caller) have been developed. We demonstrate that our combined pipeline (Isaac) is 4-5 times faster than BWA+GATK on equivalent hardware

Star

http://bioinformatics.oxfordjournals.org/content/29/1/15

STAR’s high mapping speed is traded off against RAM usage: STAR requires ∼27 GB of RAM for aligning to the human genome. Like all other aligners, with the exception of RUM, the amount of RAM used by STAR does not increase significantly with the number of threads, as the SA is shared among all threads. Although STAR’s RAM requirements would have been prohibitively expensive several years ago, at the time when the first short read aligners were developed, recent progress in semiconductor technologies resulted in a substantial drop of RAM prices, and modern high performance servers are commonly equipped with RAM >32 GB. STAR has an option to use sparse SAs, reducing the RAM consumption to <16 GB for the human genome at the cost of ∼25% decrease in the mapping speed, while maintaining the alignment accuracy.
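The sparse suffix array trade-off STAR describes can be sketched in miniature: index only every k-th suffix, cutting index memory roughly k-fold, at the cost of extra searches per lookup. This is a toy illustration of the general technique, not STAR's actual data structure, and it assumes patterns are at least k characters long:

```python
import bisect

def build_sparse_sa(text, k=2):
    """Sorted array of every k-th suffix start: ~1/k the RAM of a full SA."""
    return sorted(range(0, len(text), k), key=lambda i: text[i:])

def find(text, pattern, sa, k=2):
    """Find all occurrences of pattern using the sparse suffix array.

    An occurrence at position p = p' - d (p' sampled, 0 <= d < k) is found
    by searching the SA for pattern[d:] and then verifying the d skipped
    characters -- up to k binary searches instead of one, trading speed
    for the k-fold memory saving (the same trade-off STAR quotes as ~25%).
    """
    hits = set()
    suffixes = [text[i:] for i in sa]  # materialized for clarity only
    for d in range(k):
        tail = pattern[d:]
        lo = bisect.bisect_left(suffixes, tail)
        while lo < len(suffixes) and suffixes[lo].startswith(tail):
            start = sa[lo] - d
            if start >= 0 and text.startswith(pattern, start):
                hits.add(start)
            lo += 1
    return sorted(hits)
```

With k=2 the index holds half the positions; a real implementation compares into the text in place rather than materializing suffix strings.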

