Question: Advice on workstation setup for single cell RNAseq alignment mapping differential expression analysis for a newbie
yonglin489 wrote 7 days ago:

Hello All,

First-time poster here. Our lab would like to start doing some scRNAseq work, but we have no clue about the hardware requirements. We have looked at some workstations, and our potential setup is listed below. We would really appreciate any comments on what this setup can handle for scRNAseq of mouse and human samples: the size of each sample, the number of samples, the time differential expression analysis takes, etc.

Hardware Specs:

Dell Precision 7920 Tower

Intel Xeon Gold 6130 2.1GHz, 3.7GHz Turbo, 16C, 10.4GT/s 3UPI, 22MB Cache, HT (125W) DDR4-2666

Windows 10 Pro for Workstations (4 Cores Plus) Multi - English, French, Spanish

NVIDIA® Quadro® P2000, 5GB, 4 DP (7X20T)

128GB 8x16GB DDR4 2666MHz RDIMM ECC

Intel Dual Band Wireless AC 8265 (802.11ac) 2x2 + Bluetooth module

Operating System (Boot) Drive: SATA/SAS Hard Drive/Solid State Drive

Hard Drive Controller: Integrated Intel AHCI SATA chipset controller (8x 6.0Gb/s), SW RAID 0,1,5,10

1st Hard Drive: 2.5" 512GB SATA Class 20 Solid State Drive

2nd Hard Drive: 3.5" 4TB 5400rpm SATA Hard Drive

RAID for HDD/SSD & Front PCIe NVMe SSDs: No RAID

Other specs: I have no clue what they're for, but they're part of the customization options on Dell's website:

Teradici Remote Workstation Access Host Card: No Remote Access Host Card

Network Cards: No Add-In Network Card (Integrated NIC only)

PCIe I/O Cards: Not Selected in this Configuration

Serial Port/PS2 Adapter: None

Many Thanks! YL

hardware rna-seq computer • 117 views
written 7 days ago by yonglin489
genomax (United States) wrote 7 days ago:

Other specs: I have no clue what they're for, but they're part of the customization options on Dell's website

Those should not have any effect on your intended use.

2nd Hard Drive: 3.5" 4TB 5400rpm SATA Hard Drive

This is not good. In this day and age you should not be using a 5400 rpm drive for anything but archival storage. Certainly not for data analysis.

Intel Xeon Gold 6130 2.1GHz, 3.7GHz Turbo, 16C, 10.4GT/s 3UPI, 22MB Cache, HT (125W) DDR4-2666

If that is the top-of-the-line CPU you chose, I would suggest cutting back on it (get something slower and live with the bit of time it adds to each analysis) and putting that money toward better/bigger/faster secondary storage.

Windows 10 Pro for Workstations (4 Cores Plus) Multi - English, French, Spanish

With Windows you are not going to get far unless you intend to purchase a commercial software product that runs only on Windows. You may as well skip Windows and go with Linux.

such as size of each sample, number of samples, time it takes for processing differential expression analysis,

You should give us an idea of how many samples you expect to produce/process per month or per year. If this is a one-time thing, you can go the cloud computing route and pay only for what you end up needing.

How long it may take to process one sample is not a very useful metric. With 128 GB of RAM you should be able to process samples of different types once you address the issues I pointed out above.

modified 7 days ago • written 7 days ago by genomax

NVIDIA® Quadro® P2000, 5GB, 4 DP (7X20T)

Not sure whether you need to spend any money on a GPU card. I don't think it's an advantage (let alone a necessity) for this kind of analysis.

written 7 days ago by lieven.sterck

Thanks for the comment. Yes, from what I have gathered searching for RNAseq information online lately, the GPU doesn't really matter. It was just part of the default configuration on the Dell computer we looked at.

written 7 days ago by yonglin489

Hi genomax, thanks for taking the time to answer my question.

For the hard drives, I only considered an SSD for the operating system and software. Perhaps I should choose a SAS HDD or an SSD for the second drive as well? Or is it the rpm that matters? I actually have no clue how the type of secondary storage plays into RNAseq data analysis.

Regarding the operating system, my lab also wants to purchase SeqGeq from FlowJo, which is only available for Windows or Mac. Since much of the other software we use is also compatible with Windows or Mac, we would like to use one of those. My thinking was that I could dual-boot Windows and Linux, or use the Windows Subsystem for Linux. In essence, we'd like a versatile workstation that is not too constrained by its operating system. Linux substitutes may well exist for much of what we currently use, but we'd like to minimize the learning curve.

Regarding sample size, right now we have 2 sample sets: one has 2 samples, the other has 4. Each sample has either 3000 or 6000 cells. As for how many samples we will do in the future, I truly don't have a guess.

Thanks again.

modified 7 days ago • written 7 days ago by yonglin489
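
As a rough check of the sample sizes mentioned above against 128 GB of RAM, here is a back-of-the-envelope sketch in Python. The gene count of 30,000 and the float32 storage are assumptions chosen for illustration, not figures from this thread. The estimate covers only a dense cells-by-genes count matrix; it says nothing about the memory needed for alignment or other steps.

```python
def dense_matrix_gib(n_cells: int, n_genes: int, bytes_per_value: int = 4) -> float:
    """Memory in GiB needed to hold a dense cells x genes matrix."""
    return n_cells * n_genes * bytes_per_value / 2**30


# Figures from the thread: 2 + 4 = 6 samples, 3000 or 6000 cells each.
N_SAMPLES = 2 + 4
CELLS_LOW, CELLS_HIGH = 3000, 6000

# Assumption (not from the thread): roughly 30,000 genes per matrix row.
ASSUMED_GENES = 30_000

low = dense_matrix_gib(N_SAMPLES * CELLS_LOW, ASSUMED_GENES)
high = dense_matrix_gib(N_SAMPLES * CELLS_HIGH, ASSUMED_GENES)
print(f"All samples combined: {low:.1f} to {high:.1f} GiB as dense float32")
```

Under these assumptions the combined matrix stays at a few GiB, and single-cell count matrices are typically stored in sparse form, which is smaller still.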

For your data storage you should have reasonably fast disks (ideally a small RAID array of them), since your analysis will become input/output bound (the CPU waiting for data to be delivered). Large SSDs get expensive fast, so fast SATA disks would be OK.

Sounds like you do need to have Windows present which is fine. Are you going to use 10x or something else? 6 samples is not a lot.

modified 7 days ago • written 7 days ago by genomax
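
To get a feel for what "input/output bound" means on a given drive, a minimal sketch like the following times a sequential read of a scratch file. The helper names, the 16 MB file size, and the use of the system temp directory are all illustrative assumptions. Because the file has just been written, the operating system may serve it from its cache, so the number is an optimistic upper bound and not a proper benchmark.

```python
import os
import tempfile
import time


def write_test_file(directory: str, size_mb: int = 16) -> str:
    """Create a scratch file of random bytes in `directory` and return its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as fh:
        for _ in range(size_mb):
            fh.write(os.urandom(1024 * 1024))
    return path


def read_throughput_mb_s(path: str, chunk_bytes: int = 4 * 1024 * 1024) -> float:
    """Read the file sequentially and return the observed rate in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as fh:
        while True:
            chunk = fh.read(chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / 1e6 / max(elapsed, 1e-9)


# Assumption: point this at a directory on the drive you want to test.
scratch_dir = tempfile.gettempdir()
path = write_test_file(scratch_dir)
try:
    rate = read_throughput_mb_s(path)
    print(f"{rate:.0f} MB/s sequential read in {scratch_dir}")
finally:
    os.remove(path)  # always clean up the scratch file
```

Running it once against a directory on each candidate drive gives a crude comparison; a file much larger than RAM would be needed to measure the disk itself.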

Our samples are processed by 10x Genomics. Is that what you mean by 10x?

written 7 days ago by yonglin489