Forum: Tower server configuration for NGS data analysis
12 weeks ago
harasharan • 0

Hi, I am in the teaching profession, and we would like to establish a small bioinformatics unit in our college. Our work focuses on NGS data analysis (SNP calling, transcriptome data analysis, WGS, metagenome data analysis, single-cell RNA-seq analysis). Could anyone suggest a configuration for a tower server for this purpose? Thank you.

NGS • 792 views

Personally, I think that a single central server is not necessarily the best option. Much of the analysis can be done on relatively inexpensive PCs or workstations that will reliably run for many years without all the overhead that comes with running and sysadmining a server. Check whether your college has central computation servers for the heavy preprocessing, such as alignments, and then consider doing the actual hands-on analysis elsewhere. Keep in mind that someone needs to sysadmin the server if you buy it yourself. Alternatively, depending on throughput, external cloud services might be economical for preprocessing tasks.


This is going to depend on scale. It is probably easier to maintain 3-4 workstations used by research students, who can do some of the maintenance themselves. It is completely different if it's 100 undergraduates: ensuring the proper functioning of a classroom of, say, 30 workstations (making sure they all have exactly the same versions of everything, nothing is broken, everything is up to date, there are no security holes, etc.) is a major undertaking.

We teach four classes: about 600 students take the basics of bioinformatics (alignment, annotation, quantification, assembly and SNP calling), a similar number take the basics of statistics in R, 30ish students take R-based downstream bioinformatics (DESeq2 etc.), and 20ish students take the command line, POSIX processing tools and Python.

The best solutions we have come up with for these are, respectively:

  • Running our own Galaxy server for the basics of bioinformatics (24 cores, 64 GB RAM, but we only do bacterial stuff) (600 students, but not all at once)
  • Posit Cloud for basic R and stats (600 students)
  • R running on university workstations (managed by the IT department) or on the students' own laptops for Bioconductor stuff (30ish students)
  • A Linux server we maintain for the command-line stuff, similar to the Galaxy server above but, again, only for training purposes

None of this infrastructure can deal with proper research-level work. For the 10 or so students a year who need that, we use the university HPC cluster. I also have 1-2 workstations for long-term research students based in my lab, which is a small enough number that I'm happy to look after them myself.


Thank you very much for the valuable suggestions

12 weeks ago

The answer to this is going to depend completely on the budget available and the balance of tasks. You say you want to do WGS, SNP calling, transcriptome analysis, metagenome analysis and single-cell RNA analysis, but these are very different tasks requiring different amounts of power, and they even require different specs depending on the species.

The system recommended by Prash could handle all those tasks, in any organism, easily, but it would be massive overkill if what you were really mostly doing was RNA-seq analysis or bacterial genome analysis. It is probably overkill for single-cell RNA-seq analysis too.

I also think the costing is quite optimistic, at least by UK prices. The biggest tower I could spec out with Dell has 64 cores, 512 GB RAM and 184 TB of disk, but that came in at over £100,000 (105 lakhs INR), although most of that comes from the HDDs. Without the HDDs, it's only £24,000 (~25 lakhs INR).
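
The arithmetic behind these figures can be checked with a short sketch. It assumes the exchange rate implied by the post itself (£100,000 for 105 lakhs, i.e. about 105 INR per GBP) and treats the "over £100,000" quote as exactly £100,000, so the storage share is a lower bound. The function and variable names are illustrative only.

```python
# Rough cost breakdown for the Dell tower quoted above.
# Assumption: exchange rate implied by the post (GBP 100,000 ~ 105 lakhs INR).
LAKH = 100_000  # 1 lakh = 100,000 rupees

INR_PER_GBP = 105 * LAKH / 100_000  # 105 INR per GBP


def gbp_to_lakhs(gbp: float, rate: float = INR_PER_GBP) -> float:
    """Convert a price in GBP to lakhs of INR."""
    return gbp * rate / LAKH


full_quote_gbp = 100_000   # "over GBP 100,000", taken here as a lower bound
without_hdd_gbp = 24_000   # same machine without the 184 TB of disk

storage_gbp = full_quote_gbp - without_hdd_gbp
storage_share = storage_gbp / full_quote_gbp

print(f"Without HDD: {gbp_to_lakhs(without_hdd_gbp):.1f} lakhs INR")
print(f"Storage: at least GBP {storage_gbp:,} ({storage_share:.0%} of the quote)")
```

Under these assumptions the disks account for roughly three quarters of the quote, so the amount of local storage is the first thing to pin down when budgeting.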

12 weeks ago
Prash ▴ 270

Dear Harasharan

May I suggest 1 TB RAM, 64 processors, L3/L4 cache and a 40 TB HDD, which you could get for ca. 20 lakhs INR.

Prash

12 weeks ago

It really depends on the number of students in the batch and the budget in hand.

Since there will be multiple students, I would suggest that you set up small workstations, unless you are planning to do heavy bioinformatics analysis every day.

My suggestion is to buy multiple units of a workstation such as the Dell Precision 5820 Tower (0738):

16 cores, 32 GB RAM and 1-2 TB of HDD. One workstation can be shared among at least 2-3 students. That way everyone benefits on a limited budget. It will cost around INR 1.25 lakhs per workstation.
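
As a rough sketch of the budget arithmetic in this suggestion, the snippet below estimates how many workstations a class would need and what they would cost at about 1.25 lakhs each. The class size of 30 is a hypothetical example, not a figure from the question, and the function names are illustrative.

```python
import math

COST_PER_WORKSTATION_LAKHS = 1.25  # approximate price quoted above


def workstations_needed(n_students: int, students_per_workstation: int) -> int:
    """Number of workstations when each one is shared by a fixed group size."""
    return math.ceil(n_students / students_per_workstation)


def total_cost_lakhs(n_students: int, students_per_workstation: int) -> float:
    """Total hardware cost in lakhs INR for the whole class."""
    units = workstations_needed(n_students, students_per_workstation)
    return units * COST_PER_WORKSTATION_LAKHS


# Hypothetical class of 30 students, sharing 2 or 3 per workstation.
for group_size in (2, 3):
    units = workstations_needed(30, group_size)
    cost = total_cost_lakhs(30, group_size)
    print(f"{group_size} per workstation: {units} units, {cost:.2f} lakhs INR")
```

For this hypothetical class the total comes to 12.5-18.75 lakhs, which is in the same range as the ca. 20 lakhs quoted elsewhere in the thread for a single large server.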

9 weeks ago
emmanouil.a ▴ 120

Reading the above about what you want to analyse, in my experience it could be good to create a small cluster. You can buy 2 workstations, something like: a total of 2x250 or 2x500 GB RAM (10-100 GB per analysis), a total of 2x40 or 2x80 cores (4-32 threads per analysis), 4-8 HDDs of 8 TB each (for storage; I suppose that for teaching you are going to analyse 1 chromosome and not all chromosomes in a WGS, for example), and 4-8 SSDs (for high-speed analysis). If you need more power in the future, you can add extra nodes (workstations) to the cluster.

At first you could use each workstation as a single PC/server and share it with a group of students. Later you can connect them and create a cluster if the group gets bigger. If the tower ("the box") is big, with extra slots (e.g. 8 instead of 4), you can later add extra HDDs/SSDs or extra RAM, or have it pre-adapted for a GPU without buying the GPU yet, all without buying a new workstation.

A cluster or multiple units could need a system administrator, while a couple of good, big workstations are fine with someone who likes to spend some time with Linux. You could also create many small virtual machines on those 2 workstations; that way students can also learn about installations etc. without breaking the main OS. If you have 1 workstation and it breaks, you are stuck for a few days; if you have two and one breaks, you still have the other. I suggest a tower with hardware RAID (especially for redundancy), or you can create a software RAID, and possibly a tower with a 10G network card for parallel downloads (if you have a 10G internet line).
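
The per-analysis figures above (10-100 GB RAM, 4-32 threads) give a quick way to estimate how many analyses one workstation can run at the same time. The sketch below is a simplification: it ignores memory and cores reserved for the operating system, and the "light" and "heavy" job profiles are simply the two ends of the ranges quoted in this post.

```python
def concurrent_analyses(node_ram_gb: int, node_cores: int,
                        ram_per_job_gb: int, threads_per_job: int) -> int:
    """Jobs that fit on one node, limited by whichever resource runs out first."""
    by_ram = node_ram_gb // ram_per_job_gb
    by_cores = node_cores // threads_per_job
    return min(by_ram, by_cores)


# The two workstation sizes suggested above: (RAM in GB, cores).
nodes = {"250 GB / 40 cores": (250, 40), "500 GB / 80 cores": (500, 80)}
# Assumed job profiles taken from the ends of the quoted ranges.
jobs = {"light (10 GB, 4 threads)": (10, 4), "heavy (100 GB, 32 threads)": (100, 32)}

for node_name, (ram, cores) in nodes.items():
    for job_name, (job_ram, job_threads) in jobs.items():
        n = concurrent_analyses(ram, cores, job_ram, job_threads)
        print(f"{node_name}, {job_name}: {n} at once")
```

With these numbers the smaller workstation runs 10 light or 1 heavy analysis at a time, and the larger one 20 light or 2 heavy, so the student group size per machine depends strongly on which kind of analysis the course uses.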
