Question: Any Hardware Recommendations For A Molecular Biology Lab That's Getting Into Bioinformatics?
12
8.8 years ago by Colin (120)
Colin wrote:

I am a technician in a relatively small molecular biology lab at a large university in the Northeast. We are beginning to get involved in more computationally intensive work -- lots of ChIP(-seq, -chip) and next-gen sequencing -- and are looking at expanding our computational capacity. We currently rely on a few 32-bit iMacs to do the bulk of our computing. The way we see it currently, we have the following options:

  1. Do our big stuff using a cloud service like Amazon's EC2.
  2. Purchase time on our university's cluster.
  3. Build a local computing grid.

The first two of these options do not appear to be cost-effective for us at this time, and the third appears to be too costly in terms of construction and maintenance. We are more interested in the following options:

  4. Purchase a Mac Pro and use it as a dedicated computer for alignment and analysis.
  5. Purchase an IBM x-series server and use it for the same purpose (using a Linux distro).
  6. Construct a small in-lab cluster.

Most of our current computers are Apple, and everyone in the lab is comfortable with Mac OS and a little reluctant to switch to Linux.

Personally, I think that going the IBM route would be best for us. The servers appear to be pretty good and I think it would be relatively painless for the lab to transition to Linux, especially since most of the analysis we do is based on Python scripts. So, I'd just like to know: How do other molecular biology or biochemistry labs do their bioinformatics? If we were to purchase our own hardware, what would people recommend? It seems like every bioinformatics book recommends the LAMP toolkit, but will we have any issues finding software or applications for Linux that my lab-mates are used to using on Mac OS?

Edit: Removed Reddit reference.

modified 8.8 years ago by jvijai (1.1k) • written 8.8 years ago by Colin (120)

You may have an easier time getting some tools to compile on Linux than on OS X.

written 8.8 years ago by Aaron Statham (1.1k)

I love Macs and use them all the time, but I have to agree with Aaron. I've spent a lot of time the last few weeks installing bioinformatics software on our Mac Pro, and I've been surprised at how many tools don't "just work" like they do on Linux. Some of them I had to hack a bit to get them to compile, some aren't available for OS X at all.

written 8.8 years ago by Daniel Standage (3.9k)

Khader, did you buy hardware for your NGS workflow? Can you elaborate on what you finalized? I am looking at the same question and would like some insights. Thanks ~JVJ~

written 8.6 years ago by jvijai (1.1k)

my views: http://jermdemo.blogspot.com/2011/06/big-ass-servers-and-myths-of-clusters.html

written 7.9 years ago by Jeremy Leipzig (18k)
10
8.8 years ago by Neilfws (48k), Sydney, Australia
Neilfws wrote:

Thanks reddit? Ah, I see :-)

I was responsible for bioinformatics in a small microbiology lab for several years. We began by making use of whatever machines were lying around, graduated to dedicated servers and eventually built a small (14-node) cluster.

I'd avoid building your own cluster (unless you really want to). It's a lot of hassle in terms of maintenance, space, power and cooling. For really "big stuff", the university cluster or EC2 sound like the way to go.

In terms of other hardware, keep it cheap and generic (i.e. easy to replace on failure). To me, that says x-series or similar, not Mac. Make sure you have plenty of storage, backups, and enough RAM/CPU for several people to log in and do their stuff at the same time.

In terms of software - if they can use the OS X terminal, they can use Linux. We're not talking about the pretty desktop experience; we're talking about a utility machine that people log in to remotely, not sit at. On that point, whatever it is, keep it dedicated to bioinformatics - you don't want people using it as their desktop (reading email and web browsing not allowed!)

You will have no issues finding Linux software - there are tens of thousands of freely available packages in good distros like Ubuntu. It may be a little different from what OS X users are used to, but not so much that they'll be unable to use it.

Final tip - don't run everything off one machine. If you want to provide web applications/interfaces, run a dedicated web server. If you're database-intensive, run a dedicated database server. Then you might run a third, dedicated "compute server" with the applications. Tie them together with a switch (and buy the fastest switch you can afford).

written 8.8 years ago by Neilfws (48k)

Haha. Yep, I cross-posted it to reddit. Thank you for your quick and thoughtful response. I agree with your comments on choosing the x-series over the Mac Pro. Your "Final tip" is also helpful to think about, and wasn't something our lab had discussed previously. Thanks.

written 8.8 years ago by Colin (120)
9
8.8 years ago by brentp (23k), Salt Lake City, UT
brentp wrote:

Do not build your own cluster. You'll have to pay someone to maintain it (or worse, you'll have to maintain it) and it'll be out of date in a couple of years. If you need some computing power, you can now get a Dell machine (and presumably others) with at least 32 cores, so you don't have to shuffle data among nodes; you just do your work.
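To illustrate the "one big box" point: a minimal Python sketch, assuming the per-item work is CPU-bound and independent. The `gc_fraction` task, the helper name `gc_fractions_parallel`, and the sample reads are all hypothetical stand-ins for a real analysis step. It spreads the work across local cores with the standard-library `multiprocessing.Pool`, so nothing has to be copied between nodes.

```python
import os
from multiprocessing import Pool


def gc_fraction(seq):
    """Return the fraction of G/C bases in one sequence (toy CPU-bound task)."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return gc / len(seq)


def gc_fractions_parallel(seqs, workers=None):
    """Compute the GC fraction of every sequence using a pool of local processes."""
    # Default to one worker per core reported by the OS (hypothetical policy).
    workers = workers or os.cpu_count() or 1
    with Pool(processes=workers) as pool:
        # pool.map preserves input order in the returned list.
        return pool.map(gc_fraction, seqs)


if __name__ == "__main__":
    reads = ["ACGT", "GGCC", "ATAT", ""]  # made-up example reads
    print(gc_fractions_parallel(reads, workers=2))
```

On a 32-core machine the same script would simply use more workers; whether that beats a cluster depends on how much of the job is I/O rather than CPU.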

I recommend using Linux, but if your lab is already comfortable with Mac OS, then that's close enough.

If you do go EC2, check out Brad Chapman's (and others?) http://www.cloudbiolinux.com/

written 8.8 years ago by brentp (23k)
1

Agreed, don't underestimate the challenges of building and maintaining your own hardware solution. Especially for small labs, cloud computing is a very cheap solution, as you can start and stop servers on demand and only pay when there is work to do.
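A rough way to check the pay-on-demand argument for a given lab is a break-even estimate. The sketch below is only an illustration: the function name `breakeven_hours` and every figure passed to it are made-up placeholders, not real prices for any vendor or cloud provider.

```python
def breakeven_hours(purchase_cost, yearly_upkeep, years, hourly_rate):
    """Hours of rented compute that cost the same as owning a server for `years`."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    # Total cost of ownership: up-front purchase plus upkeep for each year.
    total_owned = purchase_cost + yearly_upkeep * years
    return total_owned / hourly_rate


if __name__ == "__main__":
    # All figures are hypothetical placeholders; substitute real quotes.
    hours = breakeven_hours(
        purchase_cost=10_000, yearly_upkeep=1_000, years=3, hourly_rate=2.0
    )
    print(f"Renting is cheaper below about {hours:.0f} compute-hours over 3 years")
```

If the expected usage is well below the break-even figure, renting on demand is the cheaper route; well above it, owned hardware starts to pay off. Storage and data-transfer charges are ignored here.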

written 8.8 years ago by Istvan Albert ♦♦ (80k)

++ here, there's nothing worse than being stuck supporting an ailing, out-of-date cluster. A big box of CPUs and RAM or EC2 is the way forward.

written 8.8 years ago by Daniel Swan (13k)

Brent: Do you have a specific link to the configuration/specs for a Dell machine with 32 cores?

written 8.6 years ago by Khader Shameer (18k)
0
8.8 years ago by jvijai (1.1k), United States
jvijai wrote:

Someone hawked a solution called POD: http://www.penguincomputing.com/POD/HPC_as_a_service The company rents out HPC nodes, much like EC2. Disclaimer: I don't work for them, I don't receive any kind of benefits from them, and I am not a customer.

written 8.8 years ago by jvijai (1.1k)