Question

Server Requirements for long read data analysis

0

Entering edit mode

5 months ago

genomics_student • 0

Hello everyone,

I would like to ask what are the minimum to very good server specifications for setting up a server suitable for long read data analysis. I have only analyzed short read sequencing data before, so requirements such as good GPU are not much of a problem. If your lab also uses long read data, I appreciate your inputs or suggestions. I appreciate all your help!

longreadserver longreadanalysis serverrequirements Server • 1.0k views

ADD COMMENT • link updated 5 months ago by colindaven 8.1k • written 5 months ago by genomics_student • 0

score 2 · Answer 1 · 2025-06-04

Like always, it depends on what you want to do and at what scale. The most resource-intensive step determines the minimum requirements of the system. Do you want to do the basecalling of ONT data? Then, the system requirements for dorado will shape GPU support and vRAM requirements of your system, including the necessity of CUDA (nvidia) support. (Indeed from the description it looks like I could do the basecalling on a MacBook Air) https://dorado-docs.readthedocs.io/en/latest/ Otherwise, it may be assembly with Flye or whatever you want to use. If you already have a system for short read analysis (with a minimum of 128-256GB RAM) and enough storage, it may be sufficient to add CUDA/GPU support for the bassecalling step to it. Also, if you have a machine already, it is worthwhile just trying out some standard NF-core pipelines (https://nf-co.re/genomeassembler/1.0.1/) with a public dataset of the size and organims you will be dealing with.

score 2 · Answer 2 · 2025-06-05

Basecalling - https://github.com/Kirk3gaard/2025-Crowdsource-GPU-basecalling-stats Note that apple silicon does not seem to work well these days. Go for an nvidia card
server or workstation ? Depends on your budget, consumer cards like the 4090 etc are a lot cheaper than the data center A100 H100 etc. 24GB GPU Ram is enough for basecalling, but insufficient for dorado-correct (32GB+)
CPU - go for AMD for the number of hyperthreaded cores, they work very well
- RAM - as much as possible, min 128 GB, more is better

And yes, please say what applications you have in mind, and try some public datasets.