Forum: SSD or HDD for genome analysis
14 days ago, maksjytov.nail (10) wrote:

Hello. I need to analyze mostly human whole exomes and, far less frequently, human whole genomes. Today I must choose which is better for these purposes: 2x480 SSD or 2x2TB HDD. The current pipeline for exome analysis is very old and written in Perl, and I'm going to rewrite it with new software. Please help me choose what kind of storage I need for my goals. Also, this hardware will be used for six months.

modified 14 days ago by Nicolas Rosewick (8.3k) • written 14 days ago by maksjytov.nail (10)

If this is for a workstation rather than a cluster, get some SSDs so you can productively run things in parallel. SSDs are really cost-effective these days. For archiving, get a big external drive (10TB or so) to store things once they are analyzed.

written 14 days ago by ATpoint (24k)

IMO go for an SSD on your laptop and buy a NAS (a RAID5 25TB NAS costs about 2000€; a 10TB one ~1000€).

modified 14 days ago • written 14 days ago by Nicolas Rosewick (8.3k)

You could also consider hybrid drives. I used to have one, and it felt like the best of both worlds to me: plenty of cheap storage with the HDD and speed with the SSD.

modified 14 days ago • written 14 days ago by Carlo Yague (4.7k)

The hybrid drives will move frequently used files to the SSD. If you are running a new analysis, all those files will be new, so they will probably be relegated to the HDD.

written 14 days ago by igor (8.6k)

Sure, not everything can be put on the SSD part of a hybrid drive, because it has far more limited capacity than a full SSD. In the context of bioinformatics, it could be worth it for, let's say, the scripts and libraries that are accessed often, small databases, perhaps the indexed genome for read mapping, etc.

IMHO, hybrids are a good balance of cost, capacity and performance, but in the end it all comes down to user needs and budget.

written 14 days ago by Carlo Yague (4.7k)
14 days ago, Philipp Bayer (6.5k), Australia/Perth/UWA, wrote:

Interestingly, there's a whole paper about this! https://academic.oup.com/bib/article/17/4/713/2240499/

It looks like some programs sped up significantly, while others showed no improvement. Personally I'd rather go for space than for speed (so the 2x2TB HDD), but I work with massive plant genomes, and I don't know what scale of data you will be working with!
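
One way to see which group your own pipeline steps fall into is to measure how fast a candidate disk can actually feed them data. The snippet below is only a minimal sketch, not a proper benchmark: the function name `read_throughput_mb_s` is made up for this example, and the operating system's page cache will inflate the number if the same file was read recently, so use a file larger than RAM or a freshly written one.

```python
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Sequentially read `path` and return the observed throughput in MiB/s.

    Hypothetical helper for illustration only; results are optimistic
    when the file is already in the OS page cache.
    """
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        # Timer resolution too coarse for a tiny file.
        return float("inf")
    return total_bytes / (1024 * 1024) / elapsed
```

Running it on the same large FASTQ or BAM file stored on each type of drive gives a rough idea of the raw read speed gap. If a pipeline step takes much longer than the read time this implies, it is probably CPU-bound and unlikely to benefit much from an SSD.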

written 14 days ago by Philipp Bayer (6.5k)

Most of the time I work with human exome sequence data, which ranges from 1 GB to 10 GB in size (paired-end), and human whole-genome sequence data, which is about 100 GB in size (paired-end).
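
As a rough back-of-the-envelope check, these sizes can be compared against the two storage options from the question. The sketch below rests on several assumptions that are not from the thread: "480" is taken to mean 480 GB, the two drives are used as plain combined capacity (no mirroring), and each sample is assumed to need about three times its raw size on disk for intermediate files such as alignments. The `overhead_factor` of 3 is a placeholder to adjust for your own pipeline.

```python
def samples_that_fit(capacity_gb, sample_gb, overhead_factor=3.0):
    """Return how many samples fit on a volume of `capacity_gb`.

    Each sample is assumed to occupy sample_gb * overhead_factor on disk
    (raw reads plus intermediates); the factor is a guess, not a measurement.
    """
    if sample_gb <= 0 or overhead_factor <= 0:
        raise ValueError("sample_gb and overhead_factor must be positive")
    return int(capacity_gb // (sample_gb * overhead_factor))


# Combined capacity of each option, assuming no RAID mirroring.
configs = {"2x480 GB SSD": 2 * 480, "2x2TB HDD": 2 * 2000}
# Upper-end sizes quoted above.
datasets = {"exome (10 GB)": 10, "whole genome (100 GB)": 100}

for config_name, capacity in configs.items():
    for dataset_name, size in datasets.items():
        count = samples_that_fit(capacity, size)
        print(f"{config_name}: about {count} x {dataset_name}")
```

Under these assumptions the SSD pair holds a few dozen exomes but only about three whole genomes at once, so finished runs would need to be moved elsewhere regularly.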

written 14 days ago by maksjytov.nail (10)
14 days ago, SaltedPork (100) wrote:

If this is for a Desktop/Workstation then I would install your OS and pipelines on an SSD. Once an analysis is complete, move the data onto HDDs for backup.
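
The "analyze on SSD, then move to HDD" step could be scripted along these lines. This is only a sketch: the function name `archive_run` and the directory layout (one directory per finished run) are assumptions for illustration, and it does no checksum verification, which you may want before relying on it for backups.

```python
import shutil
from pathlib import Path


def archive_run(run_dir, archive_root):
    """Move a finished analysis directory from fast storage to the archive.

    Hypothetical helper: `run_dir` is one run's directory on the SSD,
    `archive_root` is a directory on the HDD. Returns the new location.
    """
    run_dir = Path(run_dir)
    archive_root = Path(archive_root)
    if not run_dir.is_dir():
        raise FileNotFoundError(f"no such run directory: {run_dir}")
    target = archive_root / run_dir.name
    if target.exists():
        # Refuse to overwrite or merge into an existing archived run.
        raise FileExistsError(f"already archived: {target}")
    archive_root.mkdir(parents=True, exist_ok=True)
    # shutil.move copies then deletes when source and target are on
    # different filesystems, which is the SSD-to-HDD case here.
    shutil.move(str(run_dir), str(target))
    return target


# Example call with made-up paths:
# archive_run("/ssd/work/exome_run_01", "/hdd/archive")
```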

written 14 days ago by SaltedPork (100)