I wanted to get an idea of the computational resources required for exome analysis at about 400 samples per month, from raw FASTQ files to final clinical-grade output.
Any suggestions or help would be really appreciated.
What others have said, but also
If you have a big budget, go for servers with >512 GB RAM and AMD Epyc processors, plus either a linked SSD shelf or local SSDs on each server.
Configure everything using Ansible (or Chef/Puppet if preferred) to save time, increase standardization, and make it easy to add to your fleet if you expand in the future.
I don't think you will get a precise answer, since the details are sparse. Clinical grade means you need to ensure data protection and long-term storage. 400 per month means about 13-20 a day, so we are well beyond a normal workstation and into a few server nodes. That in turn requires proper administration to comply with the legal restrictions on data security and privacy mentioned above. It is obviously quite an investment, so I recommend getting in contact a) with your legal department to check their requirements on storage and data protection, b) with your institution (university/hospital/center etc.) to see what existing resources they can provide, especially storage, and c) with whoever holds the budget, to estimate whether this is feasible at all.
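To make the "a few server nodes" point concrete, here is a back-of-envelope sizing calculation. All the per-exome numbers are assumptions for illustration only (CPU-hours per exome and data footprint vary a lot by pipeline and coverage); benchmark your own pipeline and substitute real values.

```python
# Rough sizing for 400 exomes/month. ALL per-exome constants are
# assumptions, not measurements -- replace with your own benchmarks.
EXOMES_PER_MONTH = 400
CPU_HOURS_PER_EXOME = 12   # assumed: FASTQ -> BAM -> VCF, GATK-style pipeline
GB_PER_EXOME = 10          # assumed: compressed FASTQ + BAM + VCF combined

exomes_per_day = EXOMES_PER_MONTH / 30
cpu_hours_per_day = exomes_per_day * CPU_HOURS_PER_EXOME
# Sustained cores needed if jobs run around the clock:
cores_sustained = cpu_hours_per_day / 24
storage_tb_per_month = EXOMES_PER_MONTH * GB_PER_EXOME / 1024

print(f"{exomes_per_day:.1f} exomes/day")
print(f"{cpu_hours_per_day:.0f} CPU-hours/day -> ~{cores_sustained:.0f} cores at 100% utilization")
print(f"~{storage_tb_per_month:.1f} TB of new data per month (before retention copies)")
```

Even under these optimistic assumptions you need a node's worth of always-busy cores plus several TB of new storage every month, which supports the "server nodes, not a workstation" conclusion.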
In addition to the things noted by ATPoint: this is a good case for using cloud compute, since you could potentially analyze all 400 exomes in parallel quite easily (if you are so inclined).
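The parallelism here is trivial because each sample is independent. A minimal sketch of the fan-out, assuming a hypothetical per-sample wrapper script `run_exome_pipeline.sh` (on a real cloud setup you would submit these as batch/Slurm jobs rather than local processes):

```python
# Sketch: run many independent per-sample pipelines concurrently.
# `run_exome_pipeline.sh` is a HYPOTHETICAL wrapper (one sample per
# invocation); substitute your own pipeline command or job submission.
import subprocess
from concurrent.futures import ThreadPoolExecutor

samples = [f"sample_{i:03d}" for i in range(1, 401)]  # 400 exomes

def run_sample(sample: str) -> int:
    try:
        # Each invocation is independent -> embarrassingly parallel.
        return subprocess.run(["./run_exome_pipeline.sh", sample]).returncode
    except FileNotFoundError:
        return 127  # wrapper script not present (e.g. dry run)

with ThreadPoolExecutor(max_workers=20) as pool:  # 20 samples at a time
    codes = list(pool.map(run_sample, samples))

failed = [s for s, c in zip(samples, codes) if c != 0]
print(f"{len(samples) - len(failed)} succeeded, {len(failed)} failed")
```

The `max_workers` knob is where cloud elasticity pays off: on-premises you are capped by the cores you bought, while in the cloud you can scale it to the whole batch and pay only for the burst.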