Forum: SSD or HDD for genome analysis
15 months ago, maksjytov.nail (10) wrote:

Hello. I need to analyze mostly human whole exomes and, far less frequently, human whole genomes. Today I must choose which is better for this purpose: 2x480 GB SSD or 2x2 TB HDD. The current exome analysis pipeline is very old and written in Perl, and I'm going to rewrite it with new software. Please help me choose what kind of storage I need. Also, this hardware will be used for six months.

sequencing hardware forum genome • 1.2k views
modified 15 months ago by Nicolas Rosewick (9.3k) • written 15 months ago by maksjytov.nail (10)

If this is for a workstation rather than a cluster, get some SSDs so you can productively run things in parallel. SSDs are really cost-effective these days. For archiving, get a big external drive, 10 TB or so, to store things once they are analyzed.

written 15 months ago by ATpoint (44k)

IMO, go for an SSD in your laptop and buy a NAS (a RAID 5 25 TB NAS costs about 2000€; a 10 TB one about 1000€).

modified 15 months ago • written 15 months ago by Nicolas Rosewick (9.3k)

You could also consider hybrid drives. I used to have one, and it felt like the best of both worlds to me: plenty of cheap storage from the HDD and speed from the SSD.

modified 15 months ago • written 15 months ago by Carlo Yague (5.5k)

The hybrid drives will move frequently used files to the SSD. If you are running a new analysis, all those files will be new, so they will probably be relegated to the HDD.

written 15 months ago by igor (12k)

Sure, not everything can be put on the SSD portion of a hybrid drive, because it has far less capacity than a full SSD. In the context of bioinformatics, it could be worth it for, let's say, scripts and libraries that are accessed often, small databases, perhaps the indexed genome for read mapping, etc.

IMHO, hybrids are a good balance of cost, capacity and performance, but in the end it all comes down to user needs and budget.

written 15 months ago by Carlo Yague (5.5k)
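
The caching behaviour discussed in this sub-thread can be illustrated with a toy model. This is only a sketch: real hybrid drives cache at the block level with vendor-specific firmware, and the `ToyHybridDrive` class, the `promote_after` threshold, and the file names below are all hypothetical, chosen just to show why freshly written analysis files tend to be served from the HDD while a repeatedly read genome index may end up on the SSD.

```python
from collections import Counter


class ToyHybridDrive:
    """Toy model of a hybrid drive (ASSUMPTION: not how any real firmware works).

    A file is promoted to the SSD tier once it has been read
    `promote_after` times, as long as a free SSD slot remains.
    There is no eviction, to keep the sketch short.
    """

    def __init__(self, ssd_slots, promote_after=3):
        self.ssd_slots = ssd_slots          # how many files fit on the SSD tier
        self.promote_after = promote_after  # reads needed before promotion
        self.reads = Counter()              # read count per file name
        self.ssd = set()                    # files currently on the SSD tier

    def read(self, name):
        """Record one read and return the tier that served it."""
        tier = "ssd" if name in self.ssd else "hdd"
        self.reads[name] += 1
        hot = self.reads[name] >= self.promote_after
        if tier == "hdd" and hot and len(self.ssd) < self.ssd_slots:
            self.ssd.add(name)  # later reads of this file come from the SSD
        return tier


if __name__ == "__main__":
    drive = ToyHybridDrive(ssd_slots=2)
    # A genome index that is read again and again gets promoted.
    print([drive.read("genome.idx") for _ in range(5)])
    # New samples are each read once, so they never leave the HDD tier.
    print([drive.read(f"sample{i}.fastq") for i in range(4)])
```

In this model the reused index benefits from the SSD after a few reads, while one-pass input files do not, which matches both points made above.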
15 months ago, Philipp Bayer (6.9k), Australia/Perth/UWA, wrote:

Interestingly, there's a whole paper about this! https://academic.oup.com/bib/article/17/4/713/2240499/

It looks like some programs sped up significantly, while others showed no improvement. Personally I'd rather go for space than for speed (so the 2x2 TB HDD), but I work with massive plant genomes and don't know what scale of data you will be working with!

written 15 months ago by Philipp Bayer (6.9k)

Most of the time I work with human exome sequence data, which ranges in size from 1 GB to 10 GB (paired-end), and human whole-genome sequence data with a size of 100 GB (paired-end).

written 15 months ago by maksjytov.nail (10)
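
To put these sizes against the two options in the question, here is a back-of-the-envelope sketch. The per-sample sizes and drive capacities come from the thread; the `WORKING_FACTOR` multiplier for intermediate files and the decision to simply add up both drives (no mirroring) are assumptions for illustration, not measurements.

```python
GB_PER_TB = 1000  # decimal units, as drive vendors quote them

# Raw input sizes quoted in the thread (GB per paired-end sample).
EXOME_GB_MAX = 10
GENOME_GB = 100

# ASSUMPTION: alignments, variant calls and temp files multiply the raw
# footprint by roughly this factor. Measure this on your own pipeline.
WORKING_FACTOR = 3.0

# ASSUMPTION: both drives are used as plain capacity (no RAID 1 mirror).
OPTIONS_GB = {
    "2x480 GB SSD": 2 * 480,
    "2x2 TB HDD": 2 * 2 * GB_PER_TB,
}


def samples_that_fit(capacity_gb, raw_gb_per_sample, factor=WORKING_FACTOR):
    """Whole number of samples that fit, counting raw data plus working files."""
    return int(capacity_gb // (raw_gb_per_sample * factor))


def report():
    """Return (option, exomes, genomes) tuples for each storage option."""
    return [
        (name,
         samples_that_fit(cap, EXOME_GB_MAX),
         samples_that_fit(cap, GENOME_GB))
        for name, cap in OPTIONS_GB.items()
    ]


if __name__ == "__main__":
    for name, exomes, genomes in report():
        print(f"{name}: ~{exomes} large exomes or ~{genomes} genomes at once")
```

Under these assumptions the SSD pair holds only a handful of whole genomes in flight at a time, so the choice depends heavily on how quickly finished samples are moved off the working disks.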
15 months ago, SaltedPork (110) wrote:

If this is for a desktop or workstation, then I would install your OS and pipelines on an SSD. Once an analysis is complete, move the data onto HDDs for backup.

written 15 months ago by SaltedPork (110)
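
The "analyze on SSD, then move to HDD" step suggested above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the `archive_file` helper and the example paths are hypothetical, and it assumes the archive disk is mounted as an ordinary directory. It copies first, verifies a checksum, and only then deletes the original, so an interrupted copy does not lose data.

```python
import hashlib
import shutil
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_file(src, archive_dir):
    """Copy `src` into `archive_dir`, verify the copy, then remove the original.

    Hypothetical helper: raises instead of overwriting an existing archive copy.
    """
    src = Path(src)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / src.name
    if dest.exists():
        raise FileExistsError(f"{dest} already exists in the archive")
    shutil.copy2(src, dest)  # copy2 also preserves timestamps
    if sha256sum(src) != sha256sum(dest):
        dest.unlink()  # discard the bad copy, keep the original
        raise OSError(f"checksum mismatch while archiving {src}")
    src.unlink()  # free space on the fast disk only after verification
    return dest


if __name__ == "__main__":
    # Example paths are placeholders; adjust to your own mount points.
    print(archive_file("/ssd/work/sample1.bam", "/hdd/archive/run1"))
```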