I am in the process of building a computer on a limited budget. Is an i5 with 16 GB of RAM and a 500 GB SATA HDD OK for tools like bowtie2, bedtools, picard, etc.? I am planning to put 64-bit Ubuntu 14.04.3 on it as well. The data will come from genomic resequencing of the medical exome (4,500 genes). Any thoughts or suggestions? Thank you :).
How limited are your resources? I have a couple of suggestions from experience:
A 500 GB SATA drive isn't nearly enough unless you have a large NAS or other storage device that you routinely move data to. Even with a NAS, it is still pretty small once you have downloaded the various reference and annotation files and started working with multiple samples, assuming you're not going to be working on just a single exome. You really need to bump that up to a few TB. If you don't have anywhere else to back up key data, you probably also want to consider some sort of RAID setup. Losing data sucks, and drives WILL fail, usually at the worst possible time. IMO Western Digital is probably the vendor of choice right now for drives. Unless disk speed is a major bottleneck, WD Blacks are good drives for desktops and workstations. The WD Re4 is an enterprise-level drive at a pretty similar price (4 TB is about $249 US). As suggested, an SSD for the operating system, key resource files, etc. is also very nice. 120 GB is about the sweet spot for price/performance right now.
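If you want to put rough numbers on the disk question, here is a minimal back-of-the-envelope sketch. The function names, the 25% headroom, and every figure in the usage lines are placeholders I made up for illustration, not measured sizes; plug in what your own reference files and per-sample outputs actually take up.

```python
def estimate_storage_gb(n_samples, per_sample_gb, reference_gb, headroom=0.25):
    """Rough disk budget: per-sample data plus shared reference/annotation
    files, padded by a headroom fraction for temp files and growth.
    All inputs are your own estimates, in GB."""
    base = n_samples * per_sample_gb + reference_gb
    return base * (1 + headroom)


def fits_on_disk(required_gb, disk_gb):
    """True if the estimated requirement fits on a disk of the given size."""
    return required_gb <= disk_gb


# Placeholder numbers only (hypothetical): 20 samples, 30 GB each
# (reads + alignments + intermediates), 100 GB of reference files.
needed = estimate_storage_gb(n_samples=20, per_sample_gb=30, reference_gb=100)
print(f"Estimated need: {needed:.0f} GB")
print("Fits on 500 GB:", fits_on_disk(needed, 500))
```

The point is less the exact figure than seeing how quickly the per-sample term dominates once you go past a handful of samples.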
16 GB of RAM will work... barely. I am currently using a workstation with 16 GB after coming from a machine with 48 GB. Even though I have the same number of available threads as on the old machine, I have to remember that I often need to run with fewer of them, for instance when I am passing a maximum of 4 GB of RAM to each thread in the GATK toolchain. It will work, but if you can get more, get more.
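To make the RAM-versus-threads trade-off concrete, here is a small sketch of the arithmetic. The helper name, the 2 GB reserved for the OS, and the thread count of 8 are my own assumptions for illustration; only the 16 GB, 48 GB, and 4 GB-per-thread figures come from the setup described above.

```python
def max_concurrent_jobs(total_ram_gb, ram_per_job_gb, cpu_threads, reserved_gb=2.0):
    """How many jobs can run at once without exceeding RAM or thread count.

    reserved_gb is RAM held back for the OS and other processes
    (an assumed figure, adjust to taste).
    """
    usable_gb = total_ram_gb - reserved_gb
    if usable_gb <= 0 or ram_per_job_gb <= 0:
        return 0
    by_ram = int(usable_gb // ram_per_job_gb)
    # The binding limit is whichever runs out first: memory or threads.
    return max(0, min(by_ram, cpu_threads))


# Assuming 8 available threads (hypothetical) and 4 GB per thread:
print(max_concurrent_jobs(16, 4, 8))  # 16 GB machine
print(max_concurrent_jobs(48, 4, 8))  # 48 GB machine
```

Under those assumptions the 16 GB machine is limited by memory while the 48 GB machine is limited by thread count, which is exactly the situation where extra RAM pays off.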
You should really look at a workstation with Xeon processors. The performance difference between a Xeon and an i5 is huge, even with a similar number of cores and clock speed.