Cloud Compute Cluster vs. Local Compute Cluster for RNA-seq Analysis
6.8 years ago
Chen Mor • 0

Hey everyone!

What are your thoughts on using a local cluster vs. a cloud-based one for RNA-seq analysis? Are there any pros and cons you can share from your own experience?

Best, Chen

RNA-Seq cloud compute cluster • 2.0k views

I can't think of anything significant that would make one better than the other. Accessibility usually gives the edge to cloud-based setups (not everyone has the luxury of a private server or cluster). If you use cloud-based VMs, you have the bonus of the server being 'all yours' for a while, so you can abuse it somewhat. It really depends on what you need and what you already have.

6.8 years ago

Cost-wise, the choice depends on how you are charged for using your local cluster and its associated storage. For one-offs and short-term projects a cloud-based solution may be cheaper, but for regular use over the long run a local cluster tends to be cheaper (especially when you take into account mistakes, bugs, and so on). Cloud-based solutions may also charge for data transfer, and uploading or downloading large amounts of data can be very slow (for very large datasets it may only be practical with a physical transfer service such as Amazon's Snowball or Snowmobile).
The main advantage of cloud-based storage would be for sharing data with people outside your institute.
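To make the cost trade-off above concrete, here is a rough back-of-the-envelope sketch. Every number you feed it (price per core-hour, egress price per GB, link speed, the up-front cost of local hardware) is your own assumption; none of the values come from this thread or from any provider's price list, and the function names are purely illustrative.

```python
import math


def cloud_cost(core_hours, price_per_core_hour, data_gb, egress_price_per_gb):
    """Compute charge plus data download (egress) charge for one project."""
    return core_hours * price_per_core_hour + data_gb * egress_price_per_gb


def transfer_hours(data_gb, link_mbit_per_s):
    """Hours needed to move data_gb gigabytes over a link of the given speed."""
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8000 megabits
    return megabits / link_mbit_per_s / 3600


def break_even_projects(local_fixed_cost, local_cost_per_project,
                        cloud_cost_per_project):
    """Smallest number of projects at which the local cluster costs no more
    than the cloud in total. Returns None if local never catches up."""
    saving = cloud_cost_per_project - local_cost_per_project
    if saving <= 0:
        return None
    return math.ceil(local_fixed_cost / saving)


if __name__ == "__main__":
    # All values below are placeholders, not real prices.
    per_project = cloud_cost(core_hours=1000, price_per_core_hour=0.05,
                             data_gb=500, egress_price_per_gb=0.10)
    print("cloud cost per project:", per_project)
    print("hours to upload 1 TB at 100 Mbit/s:", transfer_hours(1000, 100))
    print("break-even project count:",
          break_even_projects(10000, 20, per_project))
```

If the break-even count is far above the number of projects you expect to run, the one-off cloud route probably wins; if the transfer time alone is longer than the analysis, that points the other way.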

6.8 years ago
h.mon 35k

What kind of analyses do you need to run? For differential gene or transcript expression, the latest methods (such as Salmon or Kallisto) are so fast and light on resources that a regular laptop can run them quickly, making cluster and cloud resources unnecessary. The constraint is the size of the FASTQ files: do they fit on your disk or not?

See some discussions and examples here, here, here and here.
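The "do the FASTQ files fit on your disk" question can be checked quickly before committing to a laptop run. The snippet below is only a sketch: the list of file suffixes and the 1.5x headroom factor (to leave room for indices and output) are my own assumptions, so adjust them to your data.

```python
import shutil
from pathlib import Path

# Suffixes treated as FASTQ files (an assumption; extend as needed).
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def fastq_bytes(directory):
    """Total size in bytes of all FASTQ files found under directory."""
    return sum(
        p.stat().st_size
        for p in Path(directory).rglob("*")
        if p.is_file() and p.name.endswith(FASTQ_SUFFIXES)
    )


def fits_on_disk(required_bytes, target_dir, headroom=1.5):
    """True if required_bytes times headroom fits in the free space
    of the filesystem that holds target_dir."""
    free = shutil.disk_usage(target_dir).free
    return required_bytes * headroom <= free


if __name__ == "__main__":
    total = fastq_bytes(".")
    print(f"FASTQ data under current directory: {total / 1e9:.2f} GB")
    print("fits on this disk:", fits_on_disk(total, "."))
```

If the data are still on a remote server, use the sizes reported there as `required_bytes` and point `target_dir` at the local disk you plan to download to.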

6.8 years ago
GenoMax 141k

Take into account the local security policies at your institution or company. If those policies do not allow external or cloud-based resources, you are limited to local resources. Working with human data (or other data subject to privacy restrictions) adds another layer of complexity and may require specific agreements with the provider (e.g. if you use Amazon's cloud, you may have to ask them to keep your data in a certain geographical jurisdiction).

That said, if you need to get ~5000 samples analyzed in a week, there is simply no substitute for a cloud-based provider such as Google Compute Engine or Amazon AWS. The cost is relatively low (considering the time and infrastructure involved) when you can dial up thousands of cores on demand.
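As a rough illustration of why a deadline like that pushes you toward on-demand cores, the arithmetic can be sketched as below. The core-hours-per-sample figure is a made-up placeholder, not a benchmark; measure it on a few of your own samples first. The `utilization` parameter is likewise an assumption standing in for queueing, failures and reruns.

```python
import math


def cores_needed(n_samples, core_hours_per_sample, deadline_hours,
                 utilization=1.0):
    """Minimum number of cores that must run in parallel to finish all
    samples before the deadline, given the fraction of wall-clock time
    the cores are actually doing useful work."""
    total_core_hours = n_samples * core_hours_per_sample
    return math.ceil(total_core_hours / (deadline_hours * utilization))


if __name__ == "__main__":
    week = 7 * 24  # deadline in hours
    # 40 core-hours per sample is a hypothetical value for illustration.
    print(cores_needed(5000, 40, week))
    print(cores_needed(5000, 40, week, utilization=0.5))
```

Compare the result with the number of cores you can realistically get on your local cluster for a full week; if the gap is large, the cloud is the only practical way to hit the deadline.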
