Question: Computational infrastructure required for RNA-seq analysis, genome annotation and assembly of a eukaryotic genome
Hello, I would like some help! I need to set up a computational infrastructure for my routine bioinformatics analyses. Could you suggest a minimal or optimal configuration for tasks that include genome assembly and annotation, RNA-seq (transcriptome) analysis, and comparative genomics?

EDIT: Post title changed to make it more informative by Ashutosh Pandey.

If you are asking about programs and methods, a good starting point is "GATK's best practices":


What is "Genome montattion"?

I apologize, that was a typo.

So if I understand correctly, you are asking for the hardware settings? Or are you asking for the scripts to perform such tasks?

Hi Sam, my question is about hardware settings. ;D.

Please post comments as comments, not as answers.

Thank you, Ashutosh Pandey, for the help and information. I need an infrastructure for routine RNA-seq analysis and genome assembly. Do you have any suggestions?





You should elaborate a little more. See below:

1) Will it be just for your use? Or will it be part of a small research lab or some bioinformatics centre with multiple users?

2) Are you considering buying a multi-CPU system or a cluster, or are you talking about a desktop with good computing power and disk space? (A few of those analyses can be done on desktop machines with decent RAM and storage.)

3) The amount of storage you need will depend on how much new data you generate and analyze. We have an MD1000 system that holds 8 × 500 GB hard disks (4 TB). It is used only for analysis and for storing data during the first 6 months after the data comes off the machine. Because our sequencing facility keeps generating new data, we then need to back up those hard disks once every 6 months.
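
To make the storage point concrete, here is a rough back-of-the-envelope sketch in Python. It only reproduces the arithmetic above (8 disks of 500 GB giving 4 TB) and estimates how long such an array lasts. The function names, the 0.5 TB/month data rate, and the `working_factor` parameter are my own assumptions for illustration, not numbers from our facility, so plug in your own.

```python
def raw_capacity_tb(n_disks: int, disk_size_gb: float) -> float:
    """Raw array capacity in TB, using decimal units (1 TB = 1000 GB)."""
    return n_disks * disk_size_gb / 1000.0


def months_until_full(capacity_tb: float, new_data_tb_per_month: float,
                      working_factor: float = 1.0) -> float:
    """Months before the array fills up.

    working_factor is a hypothetical multiplier for intermediate files
    (alignments, assemblies) on top of raw data; 1.0 means raw data only.
    """
    if new_data_tb_per_month <= 0:
        raise ValueError("new_data_tb_per_month must be positive")
    return capacity_tb / (new_data_tb_per_month * working_factor)


capacity = raw_capacity_tb(8, 500)  # the 8 x 500 GB array described above
print(capacity)  # 4.0

# Assumed rate: 0.5 TB of new data per month, raw data only.
print(months_until_full(capacity, 0.5))  # 8.0
```

If intermediate files double your footprint (`working_factor=2.0`), the same array would fill in about half that time, which is why the backup interval matters as much as the raw capacity.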

