5.3 years ago
stonefaarma
Hi,
I'm going to buy a Linux box to run psiblast searches locally (blast+ package, nr database, typically one query at a time). Any advice on the best way to spend the money? To be more specific, should I prioritize the maximum number of cores, the maximum clock rate, the maximum RAM, or the maximum disk speed (e.g. a PCIe SSD)?
To some extent, the question is about what the limiting factor in blast-ing is. I would assume it's not memory, because the nr database is way too large to fit into memory. I ran a few tests on a local machine, and -num_threads didn't actually help. So is loading the database the bottleneck?
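One way to check whether extra threads help is to time the same search at several thread counts. The sketch below is only an illustration: it assumes psiblast is on the PATH, and the file name query.fasta, the output names, and the list of thread counts are placeholders rather than anything from this thread.

```python
import os
import shutil
import subprocess
import time


def psiblast_cmd(query, db, threads, iterations=3):
    """Build a psiblast command line from standard BLAST+ options."""
    return [
        "psiblast",
        "-query", query,
        "-db", db,
        "-num_threads", str(threads),
        "-num_iterations", str(iterations),
        "-out", f"psiblast_t{threads}.out",  # hypothetical output name
    ]


def time_run(cmd):
    """Run a command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    query = "query.fasta"  # placeholder query file
    if shutil.which("psiblast") is None or not os.path.exists(query):
        print("psiblast or the query file is missing; nothing to benchmark")
    else:
        for threads in (1, 2, 4, 8):
            elapsed = time_run(psiblast_cmd(query, "nr", threads))
            print(f"{threads} threads: {elapsed:.1f} s")
```

The first run may also include the cost of reading the database from disk, so it is probably worth repeating the whole loop and comparing the second pass. If the times barely change with the thread count, the search is likely limited by something other than CPU.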
Not correct. With the nr database you need a certain amount of free RAM to be able to run a psi-blast search at all, so get as much memory as your budget allows (32+ GB to be safe). Have you considered running blast in the cloud using an Amazon AMI? You may also want to switch to delta-blast (DELTA-BLAST constructs a PSSM using the results of a Conserved Domain Database search and then searches a sequence database), since it works rather well.
I'd suggest buying a system that is generally usable for other bioinformatics work as well, rather than focusing too much on one application. The general answer will be: get yourself as much memory as you can afford. But using cloud services could give you a lot more flexibility.
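To put a number on "as much memory as you can afford", a rough check is to compare the on-disk size of the database files with the machine's physical RAM. This is a minimal sketch under stated assumptions: the prefix /data/blastdb/nr is a hypothetical location, the 0.8 headroom factor is an arbitrary choice, and the idea that searches benefit when the database volumes fit in the OS file cache is an assumption here, not a measured result.

```python
import glob
import os


def db_size_bytes(db_prefix):
    """Sum the sizes of all files named '<db_prefix>.*' (the database volumes)."""
    return sum(os.path.getsize(path) for path in glob.glob(db_prefix + ".*"))


def total_ram_bytes():
    """Physical RAM as reported by sysconf (works on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_ram(db_bytes, ram_bytes, headroom=0.8):
    """True if the database takes at most `headroom` of the RAM (assumed margin)."""
    return db_bytes <= headroom * ram_bytes


if __name__ == "__main__":
    prefix = "/data/blastdb/nr"  # hypothetical database location
    db = db_size_bytes(prefix)
    ram = total_ram_bytes()
    print(f"database: {db / 2**30:.1f} GiB, RAM: {ram / 2**30:.1f} GiB")
    print("fits in RAM with headroom:", fits_in_ram(db, ram))
```

Running it against the actual database directory gives a quick sense of whether 32 GB or 64 GB is the more sensible target.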
The cloud is too expensive. E.g. with a local machine I don't pay for electricity.
@cpad0112 I need something versatile, so I will buy a Linux box. I use other software, too.
Let me ask it this way: since 32 GB of RAM costs about the same as a 1 TB PCIe SSD, should I buy 32 GB RAM plus a PCIe SSD, or 64 GB RAM without a PCIe SSD?
There is no substitute for adequate RAM. If you don't have enough you simply would not be able to do a particular analysis.
Consider what you're going to use this machine for: what kind of jobs you're going to run, how much memory they need, where the I/O-bound operations are, and how much local storage space you'll need.
In my opinion, if you need to go above 32GB RAM, you're also very likely going to deal with jobs that will need more than 64 GB so the questions are: how often is this going to happen and how are you prepared to deal with the cost/inconvenience of going for another system (i.e. cloud or HPC) in such cases?
Concerning storage, given a constrained budget, you'll have to trade speed (SSDs) for space (HDD). Maybe this paper can help you come to a decision.
The NCBI blast server is an FPGA (IMO). Buy an FPGA.