Computational infrastructure requirements for developing a genomic data analysis framework
5.9 years ago
shuksi1984 ▴ 60

What are the computational infrastructure requirements for developing an application that analyzes whole genome, exome, and transcriptome data? Example: a Galaxy-like application. I would like to know the requirements for both cloud and local servers.

next-gen sequencing • 833 views

This depends entirely on the exact application and on the number and size of the samples.


Do you really mean software development? Or do you want to analyze genomic results? Most developers only work on lightweight dev machines with small test datasets. Full whole genome datasets (the biggest) might take 24-48 hours on modern hardware, even with 56 cores, SSDs and 10 Gbit networks.


We want to develop an application to analyze genomic results, so we are starting with the infrastructure. Let's say we get 2 whole genome (WG), 3 whole exome (WE), and 3 RNA-seq samples every day.


Right, and doing what exactly to analyze them? How big are the files? What are your time constraints? Etc. Typically for that amount a day you're going to want a cluster backing things just to ensure that everything is done before the next sample rolls in (unless the WGS coverage is low).
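To make the "done before the next sample rolls in" point concrete, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this thread (2 whole genomes per day, 24-48 hours each) and assumes, purely for illustration, that one sample occupies a whole server while it runs. The function name `nodes_needed` is made up for this example.

```python
import math


def nodes_needed(samples_per_day, hours_per_sample):
    """Minimum number of servers needed to keep up with daily arrivals.

    Assumption: each sample occupies one whole server for its full run time,
    so the cluster must supply samples_per_day * hours_per_sample
    server-hours every 24 hours.
    """
    return math.ceil(samples_per_day * hours_per_sample / 24)


# 2 whole genomes per day, at the 24 h and 48 h run times quoted above
for hours in (24, 48):
    print(f"{hours} h per genome: {nodes_needed(2, hours)} servers")
```

Under these assumptions the whole genomes alone need 2 to 4 servers, before counting the exome and RNA-seq samples. Real numbers depend on coverage, the pipeline, and how well it scales across cores.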


Simple criteria:

  • 2 servers (more is better) + 1 VM head node
  • SLURM job scheduler
  • 50-100 TB of disk space (on NFS, so that further servers can be added later)
  • Server config: 48-56 cores, at least 256 GB RAM, 10 Gbit network ports
  • 10 Gbit switches
  • Backup to tape or cloud

If you have more money, go for more servers and/or a big SSD scratch volume of more than 10 TB.
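As a minimal sketch of how an application could hand work to the SLURM scheduler on such a cluster, the snippet below renders a batch script for one sample. The `#SBATCH` options are standard SLURM directives; the helper name `render_sbatch`, the default resource values, and the `run_pipeline.sh` command are hypothetical placeholders to adapt to your own pipeline.

```python
def render_sbatch(sample_id, command, cpus=48, mem_gb=200, hours=48):
    """Return the text of a SLURM batch script for a single sample.

    The defaults are illustrative only: they roughly fit one whole genome
    on a 48-56 core, 256 GB server, and are not tuned values.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={sample_id}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours}:00:00",
        f"#SBATCH --output={sample_id}.%j.log",  # %j expands to the job ID
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# run_pipeline.sh is a hypothetical wrapper around the actual analysis steps
print(render_sbatch("wgs01", "./run_pipeline.sh wgs01"))
```

The resulting text can be written to a file and submitted with `sbatch`. Exome and RNA-seq jobs would normally request fewer cores and less time so that several can share a server.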

Hope that helps.
