Question: Hardware requirements for NGS analysis
eb0906 (United States) wrote 4.6 years ago:

Our small bioinformatics core bought two Dell Precision T7610 tower workstations, each equipped with one Intel Xeon E5-2687W v2 processor (eight cores, 3.4 GHz with Turbo, 25 MB cache), 64 GB of 1866 MHz DDR3 RAM, a 1 GB NVIDIA Quadro K600 video card, a 256 GB solid-state drive, two 1 TB SATA drives, a DVD-RW drive, a 10 Gb network adapter, and an NVIDIA Tesla K20c computing processor.

I am a novice user, and my initial questions are:

  1. Do we have enough RAM to support multiple (2-3) RNA-seq analyses? For example, alignments, mapping, differential expression analysis, etc. Most of our work will involve RNA-seq, small RNA-seq, and ChIP-seq analysis.
  2. Do we need an additional CPU? (Assuming we will be analyzing at least 2 RNA-seq experiments at any given time and will have additional users (2-3) logged on and trying to analyze their own data.)
  3. It is my understanding that the greatest limiting factor in computational requirements for NGS analysis is I/O. At this point, is there any advantage to having a GPU versus CPU when it comes to NGS analysis?

Thanks in advance!

Devon Ryan (Freiburg, Germany) wrote 4.6 years ago:
  1. Depends on the aligner you and your users plan to use. If you have STAR in mind, then you might want more RAM for concurrent processes.
  2. Probably; 8 cores is more appropriate for a single-user system.
  3. Your understanding is correct and no, trying to use a GPU won't help you much (in fact, there isn't a lot that's GPU-based at the moment).
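
As a rough illustration of point 1, here is a minimal sketch of the RAM arithmetic for concurrent alignment jobs. The per-job footprints are assumptions chosen for a human-sized genome (a compressed FM-index aligner such as bowtie needs only a few GB, while STAR's uncompressed index is commonly quoted at around 30 GB); they are not measurements from this thread, and `jobs_that_fit` and `reserve_gb` are hypothetical names. Smaller genomes need far less.

```python
# Rough RAM budgeting for concurrent alignment jobs.
# All per-job figures are illustrative assumptions, not measurements.

def jobs_that_fit(total_ram_gb, per_job_gb, reserve_gb=8.0):
    """Return how many jobs of a given footprint fit in RAM at once.

    reserve_gb is headroom kept back for the OS, page cache and
    interactive users (an assumed value).
    """
    usable = total_ram_gb - reserve_gb
    if usable <= 0 or per_job_gb <= 0:
        return 0
    return int(usable // per_job_gb)


# Assumed peak memory per job for a human-sized genome.
ASSUMED_FOOTPRINT_GB = {
    "bowtie (compressed FM-index)": 4.0,
    "STAR (uncompressed suffix array)": 32.0,
}

for name, gb in ASSUMED_FOOTPRINT_GB.items():
    print(f"{name}: {jobs_that_fit(64, gb)} concurrent job(s) in 64 GB")
```

Under these assumptions a 64 GB machine has room for many bowtie jobs but only a single STAR job, which is why the choice of aligner matters more than the raw RAM figure.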

Thanks, Devon.

For our purposes, we will primarily be using Bowtie. One of my colleagues is currently running cuffdiff on 16 C. elegans samples (15-16M reads/sample), and it looks as though it has stalled at the 'Processing Loci' step with 98% of the memory in use. This is our first attempt at using these workstations for an RNA-seq analysis, so we aren't sure what to expect.

Reply written 4.6 years ago by eb0906

That should suffice for a few instances of bowtie. It's slow and doesn't support spliced alignments (i.e., no aligning RNA-seq reads to the genome), but I guess it'll work.

The cufflinks issue doesn't surprise me; it's a really slow program.

Reply written 4.6 years ago by Devon Ryan
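
On the "98% of the memory being used" observation above: on Linux, a high "used" figure often includes page cache, so on its own it does not show whether a job is starved for RAM. A minimal sketch of one way to check, assuming a Linux host whose `/proc/meminfo` reports `MemAvailable` (newer kernels do); the sample values are made up for illustration and the function names are hypothetical:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(info):
    """Fraction of RAM the kernel estimates is available to new work."""
    return info["MemAvailable"] / info["MemTotal"]


def swap_used_fraction(info):
    """Fraction of swap in use; 0.0 if no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - info.get("SwapFree", 0)) / total


# Made-up sample; on a real machine, read the text from /proc/meminfo.
SAMPLE = """MemTotal:       65843000 kB
MemAvailable:    1316860 kB
SwapTotal:       8388604 kB
SwapFree:        2097151 kB
"""

info = parse_meminfo(SAMPLE)
print(f"available RAM: {available_fraction(info):.0%}")
print(f"swap in use:   {swap_used_fraction(info):.0%}")
```

Very little available RAM combined with heavy swap use suggests the job is thrashing rather than merely being slow; plenty of available RAM suggests the program is just taking its time.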
Damian Kao (USA) wrote 4.6 years ago:

I am currently using a machine with 64 GB of RAM and 32 cores. I find that's enough for standard mapping to a genome/transcriptome and for downstream analysis.

However, that might not be enough if you plan to assemble transcriptomes or genomes. I've been using Amazon EC2 instances for a lot of my work. The high-memory 244 GB RAM instances should be fine for most transcriptomes. I've also been using starcluster to set up ad hoc EC2 clusters and running abyss on them to assemble genomes. Here is one of my ipython notebooks on how to set up starcluster on EC2 and run abyss (http://nbviewer.ipython.org/github/damiankao/phaw-genome/blob/master/03_assembly.ipynb)

If you have the money, I highly recommend supplementing your hardware needs with AWS. I can usually find a high-RAM instance (r3.8xlarge) or a high-computation instance (c3.8xlarge) for ~35 cents an hour (spot instances) somewhere in EU West or US East.
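
To put the ~35 cents an hour figure in perspective, here is a minimal sketch of the cost arithmetic. The run lengths and instance counts are hypothetical, the hourly rate is the one quoted above, and spot prices fluctuate, so treat the result as an order-of-magnitude estimate only.

```python
def spot_cost_usd(hourly_rate_usd, hours, instances=1):
    """Estimate the total cost of running spot instances at a flat rate."""
    return round(hourly_rate_usd * hours * instances, 2)


# Hypothetical workloads at the ~$0.35/hour rate mentioned above.
one_day_single = spot_cost_usd(0.35, 24)               # one instance, one day
three_day_cluster = spot_cost_usd(0.35, 72, instances=4)  # 4-node cluster, 3 days

print(f"1 instance x 24 h:  ${one_day_single}")
print(f"4 instances x 72 h: ${three_day_cluster}")
```

The estimate ignores storage and data-transfer charges, and it assumes the spot price stays flat for the whole run.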
