Forum: Bioinformatics computer upgrade help
truebeliever24 (20) wrote, 4 weeks ago:

Hi all,

I am trying to upgrade my Windows 10 computer so that I can work with genomic data: run analyses on whole-genome data, manipulate large text files (several gigabytes), and edit figures that are several MB in size in Illustrator. With my current setup (below), I'm unable to do these things.

My setup:

Processor: AMD Ryzen 7 3700X 8-core processor, 3.60 GHz
RAM: 56 GB (maxes out at 64 GB)
SSD: 250 GB (I try to run my analyses off of this drive)
HDD: 2 TB
GPU: AMD Radeon RX 5700

Please help me identify which areas need to be upgraded so I can do the things I mentioned. Thank you for your help!

Tags: forum • upgrade • computer
modified 4 weeks ago by h.mon (30k) • written 4 weeks ago by truebeliever24 (20)

That's already a pretty solid setup for a standard computer (though the SSD is a little on the small side).

There are no real upgrades you could make on standard consumer hardware that will really benefit you if what you have there is already insufficient. RAM in particular is going to be a bottleneck, and consumer hardware, even top-tier stuff, generally tops out at around 128 GB or maybe 256 GB in some cases. Our lab workstation (which is about 7 years old), by comparison, has >350 GB of RAM and about 40 TB of storage.

I see no reason why your current setup can't handle images that are a couple of MB in Illustrator. I can do that on my laptop...

As others have said, rather than throwing good money after bad, you would be better served by changing the way you work. Do away with GUIs wherever possible. Utilise cloud services for the odd super strenuous job.

written 4 weeks ago by Joe (17k)

I am thinking of making the following substantial upgrades:

AMD Ryzen Threadripper 3970x 32-core processor

MSI Creator TRX40 sTRX4 AMD motherboard

EVGA GeForce RTX 2080 Ti XC ULTRA GAMING Video Card

With this setup, I'd also be able to have up to 256 GB of RAM. There are a couple of programs I need to process my data that use a GUI, and that GUI is also available on Linux. Might this help me perform the above analyses, as well as run other programs that require a GUI and can't be used from the terminal?

modified 4 weeks ago • written 4 weeks ago by truebeliever24 (20)

This is going to be an exceedingly expensive upgrade. Threadripper CPUs alone are several thousand dollars, and 256 GB of DDR4 RAM will be very expensive, especially because AMD's latest Infinity Fabric based SKUs (like Threadripper) work best with RAM that has very tight CAS latency/timings. A cursory glance at Amazon shows 64 GB of fast RAM is already >$500, and that isn't even ECC RAM (which is not strictly necessary but good to have).

There is very little merit to spending money on GPUs for bioinformatics tasks unless you have some very specific tasks in mind (e.g. writing GPU accelerated code).

You could easily spend $5k upwards on that upgrade I think (depending how much stuff you already have), and you could still encounter workflows that you simply don't have the RAM or storage to handle.

Could you provide a bit more info about which programs you are confident you need (particularly the GUI ones)?

To be perfectly honest with you, this really sounds like wasting your money to me (but you'd have a killer gaming rig, haha).

written 4 weeks ago by Joe (17k)

Upgrading the processor may speed things up a bit, but then the memory limitation (you are almost maxed out already) would kick in. Adding more storage would not help you do larger analyses (since you would again be limited by memory). I don't see much room for cost-effective upgrades (in terms of return on your money, or of allowing you to do analyses an order of magnitude bigger than what you can currently do).

Have you thought of renting a server as needed on one of the cloud providers? That may allow you to use just enough compute as you need it and only pay for what you use. Otherwise you are starting down the path of upgrading your motherboard (2x sockets or more RAM?). At that point you may as well build a new box.

modified 4 weeks ago • written 4 weeks ago by genomax (87k)

Thanks. Would I be able to use a GUI via a cloud provider, as in running software that is not built for the command line on a remote cluster? The programs I have in mind are not available on the command line. I am also considering these upgrades. Let me know if you think they'd help.

AMD Ryzen Threadripper 3970x 32-core processor

MSI Creator TRX40 sTRX4 AMD motherboard

EVGA GeForce RTX 2080 Ti XC ULTRA GAMING Video Card

written 4 weeks ago by truebeliever24 (20)

Yes, with X11 forwarding you can still run GUI tools over a remote connection, albeit with a bit more latency.

written 4 weeks ago by Joe (17k)

One would need an X11 server on the client end as well.

written 4 weeks ago by genomax (87k)

Based on my experience with similar computer configurations, you should be able to do just about everything you need if you switch to Linux. My experience is with Intel processors instead of AMD and NVIDIA GPUs instead of AMD, but that shouldn't matter. I think this is a healthy configuration for most tasks as long as you install the system and the swap file (I'd go with 80-100 GB) on the SSD. It may seem like a waste to dedicate that much SSD space to swap, but it will pay off next time you need to load a database that requires 128 GB of RAM, which you don't have.

It wouldn't hurt to have larger disks and to put in as much memory as you can, but this strikes me as plenty for the type of analysis you described.

written 4 weeks ago by Mensur Dlakic (6.0k)

I tried Linux and it was still rather slow for me (although I had to install it on an HDD, since I didn't have adequate space on my small SSD). What do you think of the following upgrades, and then installing Linux (the GUIs I need are available on Ubuntu Desktop)?

AMD Ryzen Threadripper 3970x 32-core processor

MSI Creator TRX40 sTRX4 AMD motherboard

EVGA GeForce RTX 2080 Ti XC ULTRA GAMING Video Card

With this setup, I'd also be able to have up to 256 GB of RAM.

written 4 weeks ago by truebeliever24 (20)

If you're trying to manipulate large text files by opening them directly in a text editor, it's going to be a nightmare no matter what your specs are.

A better bet would be to:

  • Learn some basic Python and use something like a Jupyter notebook to process the text files
  • Install Git Bash and manipulate the text files with shell scripting (which also has the advantage of letting you rename/move a ton of files at once)
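
As an illustration of the line-by-line idea behind both suggestions, here is a minimal Python sketch; the file paths, the tab separator, and the "NA" missing-value marker are all placeholders for whatever your data actually uses.

```python
# Minimal sketch: filter a large tab-separated file row by row, writing
# a reduced copy without ever loading the whole file into memory.
# Paths, the column index, and the "NA" marker are hypothetical.
def filter_rows(in_path, out_path, col=2, missing="NA"):
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:  # one line held in memory at a time
            fields = line.rstrip("\n").split("\t")
            if len(fields) > col and fields[col] != missing:
                dst.write(line)
```

Because the file is streamed, peak memory stays flat even for inputs of tens of gigabytes, which is exactly why this beats opening the file in an editor.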

Installing Linux will be necessary to use most genomics tools (most aligners and other tools).

Editing files in Illustrator shouldn't be a problem with your setup -- I have 8 GB of RAM and it's mostly fine, though a bit clunky for really large files. If you're making figures, perhaps try breaking them up by panel, or don't embed rasterized images into the file until the very end (if you're dealing with a lot of microscopy or something).

If you want to get more specific with what tasks are overwhelming your computer people will probably be able to give better recommendations.

written 4 weeks ago by MaxF (70)

I will give this a try, thank you. I need to be able to remove hundreds of columns of missing data for over 100 individuals, and it's been pretty awful dealing with that. I've tried nano (and that is all I am really aware of), and it's not easy visualizing the data and observing what I need to take out that way.

written 4 weeks ago by truebeliever24 (20)

Any editor is going to struggle with large files. This is why we say you need to change the way you work wherever possible, e.g., do your text manipulations line by line using command-line tools so you aren't sucking up all your RAM.

sed, awk, grep, cut, sort, Python, and Perl, among many more, are going to be your best friends here.
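
To sketch how the column-removal task described above can be done line by line, here is a hedged two-pass Python example (assuming a tab-separated file with "NA" as the missing marker; both are placeholders). The first pass finds columns that are entirely missing, the second writes the file back without them; only one line is in memory at any time.

```python
# Two-pass sketch: drop columns that contain only the missing marker.
# Separator and "NA" marker are assumptions, not from the original post.
def drop_missing_columns(in_path, out_path, missing="NA", sep="\t"):
    all_missing = None
    with open(in_path) as fh:  # pass 1: find all-missing columns
        for line in fh:
            fields = line.rstrip("\n").split(sep)
            if all_missing is None:
                all_missing = [True] * len(fields)
            for i, value in enumerate(fields):
                if value != missing:
                    all_missing[i] = False
    keep = [i for i, gone in enumerate(all_missing) if not gone]
    with open(in_path) as src, open(out_path, "w") as dst:  # pass 2: rewrite
        for line in src:
            fields = line.rstrip("\n").split(sep)
            dst.write(sep.join(fields[i] for i in keep) + "\n")
```

Reading the file twice costs some time but keeps RAM usage constant regardless of file size, which is the trade-off the replies above are recommending.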

written 4 weeks ago by Joe (17k)

> I need to be able to remove hundreds of columns of missing data for over 100 individuals, and it's been pretty awful dealing with that. I've tried nano (and that is all I am really aware of), and it's not easy visualizing the data and observing what I need to take out that way.

I think your best bet is to use data.table with R on a ~48 GB RAM machine. You won't need too much RAM if you use the right tools. You could definitely use shell utilities too. Using the most powerful computer to achieve a goal that can be done with a medium-powered setup is a waste of resources. In my experience, any bioinformatics program that uses 256 GB of RAM is not doing its job right (or is being used wrong).
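
A rough Python analogue of that data.table advice, using pandas (the file name, tab separator, and chunk size here are assumptions), scans the table in bounded-size chunks to find which columns contain any real data, so peak RAM stays small even for very wide files:

```python
import pandas as pd

def columns_with_data(path, sep="\t", chunksize=100_000):
    """Scan the file chunk by chunk, returning the names of columns
    that contain at least one non-missing value."""
    has_data = None
    for chunk in pd.read_csv(path, sep=sep, chunksize=chunksize):
        present = chunk.notna().any()  # per column: any non-missing cell?
        has_data = present if has_data is None else (has_data | present)
    return list(has_data[has_data].index)

# Then reload only the informative columns, e.g.:
# df = pd.read_csv(path, sep="\t", usecols=columns_with_data(path))
```

The second read with `usecols` is what keeps the final in-memory table small: the empty columns are never materialized at all.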

written 4 weeks ago by RamRS (28k)
Powered by Biostar version 2.3.0