5.9 years ago
shuksi1984
What are the requirements (computational infrastructure) to develop an application for the analysis of whole genome, exome and transcriptome data? Example: a Galaxy-like application. I would like to know the requirements for both cloud and local server setups.
This is dependent entirely upon the exact application and the number and size of samples.
Do you really mean software development? Or do you want to analyze genomic results? Most developers will only work on lightweight dev machines with small test datasets. Full whole genome datasets (the biggest) might take 24-48 hours on modern hardware, even with 56 cores, SSDs and 10Gbit networks.
We want to develop an application to analyze genomic results, so to start with we are working on the infrastructure. Let's say we get 2 WG, 3 WE, and 3 RNA-seq samples every day.
Right, and doing what exactly to analyze them? How big are the files? What are your time constraints? Etc. Typically for that amount a day you're going to want a cluster backing things just to ensure that everything is done before the next sample rolls in (unless the WGS coverage is low).
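To make the "everything done before the next sample rolls in" point concrete, here is a rough back-of-the-envelope sketch in Python. It uses the daily intake given above and the upper end of the 24-48 hour WGS figure quoted earlier in the thread. The per-sample times for exome and RNA-seq are placeholder assumptions, not numbers from this thread, and the function name `nodes_needed` is made up for illustration. It also assumes each sample occupies one whole node and that jobs pack perfectly, so treat the result as a lower bound.

```python
import math

# Daily intake as stated in the thread.
SAMPLES_PER_DAY = {"wgs": 2, "wes": 3, "rnaseq": 3}

# Wall-clock hours per sample on one node.
# "wgs" is the upper end of the 24-48 h range quoted above.
# "wes" and "rnaseq" are ASSUMED placeholders; replace with your own benchmarks.
HOURS_PER_SAMPLE = {"wgs": 48.0, "wes": 6.0, "rnaseq": 4.0}


def nodes_needed(samples_per_day, hours_per_sample, hours_available=24.0):
    """Return (node-hours of work arriving per day, nodes needed to keep up)."""
    node_hours = sum(
        count * hours_per_sample[kind] for kind, count in samples_per_day.items()
    )
    # Steady state: each node supplies `hours_available` node-hours per day.
    return node_hours, math.ceil(node_hours / hours_available)


node_hours, nodes = nodes_needed(SAMPLES_PER_DAY, HOURS_PER_SAMPLE)
print(f"{node_hours:.0f} node-hours/day -> at least {nodes} nodes")
```

With these assumed figures the backlog only stays flat with several nodes running in parallel, which is why a single server is unlikely to be enough unless the WGS coverage is low or the pipeline is much faster than assumed.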
Simple criteria.
If you have more money, go for more servers and/or a large SSD scratch space of >10 TB.
Hope that helps.