Exome data analysis computing resources requirement query
10 weeks ago
1769mkc ★ 1.3k

I want to get an idea of what it takes to analyze about 400 exomes per month, from raw FASTQ files to final output, as clinical-grade data. What sort of computational resources would be required?

Any suggestions would be really helpful.

exome • 680 views

I don't think you will get a precise answer since the details are sparse. Clinical grade means you need to ensure data protection and long-term storage. 400 per month means about 10-20 a day, so you are well beyond a normal workstation and into the territory of a few server nodes. That in turn requires proper administration to comply with the legal requirements on data security and privacy. It is obviously quite an investment, so I recommend a) contacting your legal department to check their requirements on storage and data protection, b) asking your institution (university, hospital, center, etc.) what it can provide from existing resources, especially storage, and c) finding out what the budget is, to estimate whether this is feasible at all.
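For what it's worth, the "10-20 a day" figure is just the monthly target divided by the number of days you actually run. A minimal sketch of that arithmetic, assuming either 30 calendar days or roughly 20 working days per month (both day counts are my assumptions, not something the question states):

```python
def exomes_per_day(per_month: int, days_available: int) -> float:
    """Average number of exomes that must finish each day to keep up."""
    if days_available <= 0:
        raise ValueError("days_available must be positive")
    return per_month / days_available


# 400 per month comes from the question; the day counts are assumptions.
calendar_rate = exomes_per_day(400, 30)  # pipeline runs every calendar day
working_rate = exomes_per_day(400, 20)   # pipeline runs on working days only

print(f"{calendar_rate:.1f} to {working_rate:.1f} exomes per day")
# prints "13.3 to 20.0 exomes per day"
```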


This is a good case for cloud computing, since you could easily analyze all 400 exomes in parallel (if you are so inclined),

IF (in addition to the things noted by ATPoint):

  1. local IT and information security policies allow use of cloud resources
  2. you have the necessary budget
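To illustrate why parallelism matters here, a rough sketch of wall-clock time when samples run in waves of limited size. The per-exome runtime is a placeholder I made up, not a benchmark; plug in a number measured on your own pipeline:

```python
import math


def wall_clock_hours(n_samples: int, hours_per_sample: float, max_parallel: int) -> float:
    """Wall-clock time when samples run in waves of at most max_parallel."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    waves = math.ceil(n_samples / max_parallel)
    return waves * hours_per_sample


HOURS_PER_EXOME = 6.0  # placeholder assumption, measure your own pipeline

for parallel in (1, 30, 400):
    hours = wall_clock_hours(400, HOURS_PER_EXOME, parallel)
    print(f"{parallel:>3} in parallel: {hours:.0f} h wall-clock")
```

Under simple per-instance-hour billing, the total compute hours you pay for stay roughly the same in all three cases; only the waiting time changes.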
9 weeks ago

What others have said, but also:

  • What's your approximate budget?
  • On a low budget you can get a long way by using local SSDs on workstations,
  • i.e. 2-3 workstations with 48+ threads (AMD Ryzen gives good bang for your buck) and 128 GB, or better 256 GB, of RAM.
  • Local SSDs are cheap and very fast. Best to get a minimum of 4-8 TB locally.
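A quick way to sanity-check whether a few workstations like that can keep up with 400 per month is a back-of-envelope capacity estimate. This is only a sketch: the threads per sample and hours per sample below are placeholders I picked for illustration, and RAM and disk limits are not modeled at all.

```python
def monthly_capacity(
    machines: int,
    threads_per_machine: int,
    threads_per_sample: int,
    hours_per_sample: float,
    hours_per_month: float = 24 * 30,
) -> int:
    """Samples a fleet can finish in a month, limited by CPU threads only."""
    # How many samples fit side by side across the whole fleet.
    concurrent = machines * (threads_per_machine // threads_per_sample)
    # How many back-to-back runs each slot completes in a month.
    runs_per_slot = int(hours_per_month // hours_per_sample)
    return concurrent * runs_per_slot


# 3 workstations with 48 threads each (from the post above);
# 16 threads and 8 hours per sample are placeholder assumptions.
print(monthly_capacity(3, 48, 16, 8.0))
```

Real throughput will be lower than this ceiling because of failed runs, reruns, maintenance and I/O contention, so leave generous headroom.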

If you have a big budget, go for servers with >512 GB RAM and AMD EPYC processors, plus either a linked SSD shelf or local SSDs on each server.

Configure everything using Ansible (or Chef or Puppet, if preferred) to save time, increase standardization, and make it easier to add machines to your fleet if you expand in the future.


"local SSDs are cheap and very fast"

Keep in mind that they will wear out, especially if you are regularly writing terabytes of data.
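You can get a rough feel for this from the drive's rated endurance, which vendors usually quote as TBW (terabytes written). A minimal sketch, where both the TBW rating and the daily write volume are hypothetical numbers; check the datasheet of your actual drive and measure your pipeline's writes:

```python
def ssd_lifetime_years(rated_tbw: float, tb_written_per_day: float) -> float:
    """Years until the rated write endurance is used up at a constant write rate."""
    if tb_written_per_day <= 0:
        raise ValueError("tb_written_per_day must be positive")
    return rated_tbw / (tb_written_per_day * 365)


RATED_TBW = 2400.0   # hypothetical endurance rating, in TB written
DAILY_WRITES = 2.0   # hypothetical write volume, in TB per day

print(f"{ssd_lifetime_years(RATED_TBW, DAILY_WRITES):.1f} years")
```

Intermediate files (unsorted BAMs, temporary sort chunks) often account for much more write volume than the final outputs, so measure rather than guess.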
