Any Hardware Recommendations For A Molecular Biology Lab That's Getting Into Bioinformatics?
13.7 years ago
Colin ▴ 120

I am a technician in a relatively small molecular biology lab at a large university in the Northeast. We are beginning to get involved in more computationally intensive work -- lots of ChIP(-seq, -chip) and next-gen sequencing -- and are looking at expanding our computational capacity. We currently rely on a few 32-bit iMacs to do the bulk of our computing. The way we see it currently, we have the following options:

  1. Do our big stuff using a cloud service like Amazon's EC2.
  2. Purchase time on our university's cluster.
  3. Build a local computing grid.

The first two of these options do not appear to be cost-effective for us at this time, and the third appears to be too costly in terms of construction and maintenance. We are more interested in the following options:

  4. Purchase a Mac Pro and use it as a dedicated computer for alignment and analysis.
  5. Purchase an IBM x-series server and use it for the same purpose (using a Linux distro).
  6. Construct a small in-lab cluster.

Most of our current computers are Apple, and everyone in the lab is comfortable with Mac OS and a little reluctant to switch to Linux.

Personally, I think that going the IBM route would be best for us. The servers appear to be pretty good and I think it would be relatively painless for the lab to transition to Linux, especially since most of the analysis we do is based on Python scripts. So, I'd just like to know: How do other molecular biology or biochemistry labs do their bioinformatics? If we were to purchase our own hardware, what would people recommend? It seems like every bioinformatics book recommends the LAMP toolkit, but will we have any issues finding software or applications for Linux that my lab-mates are used to using on Mac OS?

Edit: Removed Reddit reference.

next-gen sequencing hardware alignment chip-seq cloud • 12k views

You may have an easier time getting some tools to compile on Linux than on OS X.


I love Macs and use them all the time, but I have to agree with Khader. I've spent a lot of time the last few weeks installing bioinformatics software on our Power Mac, and I've been surprised at how many tools don't "just work" like they do on Linux. I had to hack some of them a bit to get them to compile, and some aren't available for OS X at all.




Khader, did you buy hardware for your NGS workflow? Can you elaborate on what you settled on? I am looking at the same question and would like some insights. Thanks ~JVJ~

13.7 years ago
Neilfws 49k

Thanks reddit? Ah, I see :-)

I was responsible for bioinformatics in a small microbiology lab for several years. We began by making use of whatever machines were lying around, graduated to dedicated servers and eventually built a small (14-node) cluster.

I'd avoid building your own cluster (unless you really want to). It's a lot of hassle in terms of maintenance, space, power and cooling. For really "big stuff", the university cluster or EC2 sound like the way to go.

In terms of other hardware, keep it cheap and generic (i.e. easy to replace on failure). To me, that says x-series or similar, not Mac. Make sure you have plenty of storage, backup and enough RAM/CPU for several people to log in and do their stuff at the same time.

In terms of software - if they can use the OSX terminal, they can use Linux. We're not talking about the pretty desktop experience, we're talking about a utility machine that people log in to remotely - not one they sit at. On that point, whatever it is, keep it dedicated to bioinformatics - you don't want people using it as their desktop (reading email, web browsing not allowed!)

You will have no issues finding Linux software - there are tens of thousands of freely-available packages, in good distros like Ubuntu. It may be a little different to what OSX users are used to, but not so much that they'll be unable to use it.

Final tip - don't run everything off one machine. If you want to provide web applications/interfaces, run a dedicated web server. If you're database-intensive, run a dedicated database server. Then you might run a third, dedicated "compute server" with the applications. Tie them together with a switch (and buy the fastest switch you can afford).


Haha. Yep, I cross-posted it to reddit. Thank you for your quick and thoughtful response. I agree with your comments on choosing the x-series over the Mac Pro. Your "Final tip" is also helpful to think about, and wasn't something our lab had discussed previously. Thanks.

13.7 years ago
brentp 24k

Do not build your own cluster. You'll have to pay someone to maintain it (or worse, you'll have to maintain it) and it'll be out of date in a couple years. If you need some computing power, you can now get a Dell machine (and presumably others) with at least 32 cores--so you don't have to shuffle data among nodes, you just do your work.
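Since the original question mentions that most of the lab's analysis is Python scripts, here is a minimal sketch of what "you just do your work" on a single many-core box can look like. Everything in it is an assumption for illustration: `gc_fraction` is a toy stand-in for a real per-sample or per-chromosome task, and `run_on_all_cores` is a hypothetical helper name, not something from the thread.

```python
from multiprocessing import Pool
import os


def gc_fraction(seq: str) -> float:
    """Return the fraction of G/C bases in a sequence (toy per-task work)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_on_all_cores(seqs, workers=None):
    """Map gc_fraction over seqs, one worker process per core by default."""
    workers = workers or os.cpu_count() or 1
    with Pool(processes=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return pool.map(gc_fraction, seqs)


if __name__ == "__main__":
    # Made-up sequences standing in for real inputs.
    toy = ["ACGT", "GGCC", "ATAT", ""]
    print(run_on_all_cores(toy))
```

All the data stays on one machine's disk and memory, so there is no job scheduler or shared filesystem to set up; whether this is enough depends on how well your own tasks split into independent pieces.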

I recommend using Linux, but if your lab is already comfortable with Mac OS, then that's close enough.

If you do go EC2, check out Brad Chapman's (and others?) http://www.cloudbiolinux.com/


Agree, don't underestimate the challenges of building and maintaining your own hardware solution. Especially for small labs, cloud computing is a very cheap solution, as you can start and stop servers on demand and only need to pay when there is work to do.
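One rough way to check the "only pay when there is work to do" argument against your own numbers is a break-even estimate. This is just a sketch: `breakeven_hours` is a hypothetical helper, and the figures in the example call are placeholders, not real hardware quotes or cloud prices.

```python
def breakeven_hours(purchase_cost: float, yearly_upkeep: float,
                    years: float, hourly_rate: float) -> float:
    """Hours of rented compute that cost the same as owning for `years`."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return (purchase_cost + yearly_upkeep * years) / hourly_rate


if __name__ == "__main__":
    # Placeholder figures only; substitute real quotes and current pricing.
    hours = breakeven_hours(purchase_cost=10_000, yearly_upkeep=1_000,
                            years=3, hourly_rate=1.0)
    print(f"Break-even at about {hours:.0f} rented hours over 3 years")
```

If your expected compute hours over that period are well below the break-even figure, renting is probably cheaper; it ignores storage, data transfer and staff time, which you would need to add for a real comparison.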


++ here, there's nothing worse than being stuck supporting an ailing, out-of-date cluster. A big box of CPUs and RAM, or EC2, is the way forward.


Brent: Do you have a specific link to the configuration/specs for a Dell machine with 32 cores?

13.7 years ago
jvijai ★ 1.2k

Someone hawked a solution called POD: http://www.penguincomputing.com/POD/HPC_as_a_service The company rents out HPC nodes, much like EC2. Disclaimer: I don't work for them, I don't receive any kind of benefits from them, and I'm not a customer.
