Forum: Is Amazon Web Services (AWS) useful?
2.7 years ago
ahadli.farid ▴ 30

Hello everyone,

We are planning to analyze around 70 to 80 BAM files (for SNV detection) in the lab. However, we lack the proper hardware, as our PI wants to be sure that we can do the analysis before he invests in anything. Do you think an AWS EC2 T micro instance would suffice, at least for a brief period of time? Thanks in advance.

Regards, Farid Ahadli.

RNA-Seq bam Forum • 1.0k views

I suppose you're interested in t2.micro because it's within the Free Tier. Unfortunately, the answer is no.

Is AWS (or cloud computing in general) useful? Absolutely.

If you've decided on AWS as a provider, I think the following article on AWS Batch might be helpful: https://aws.amazon.com/blogs/compute/building-high-throughput-genomics-batch-workflows-on-aws-introduction-part-1-of-4/


This question can be quite tricky to answer. AWS will suffice as long as you have command-line Unix expertise available and know how to use AWS properly. There are security considerations to think about, and usage-based costs can add up quickly.

What program are you planning to use? You would need an AWS instance that meets that program's compute requirements (CPU, memory, and disk).
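As a rough illustration of matching an instance to a program's requirements, here is a minimal sketch. The T2 sizes in the table are the vCPU and memory figures as I recall them from the AWS instance type listing, so please check them against the current documentation. The requirement numbers in the usage example (2 vCPUs, 8 GiB) are hypothetical and do not come from any particular variant caller, and `smallest_fit` is just an illustrative helper name.

```python
from typing import NamedTuple, Optional, Sequence


class Instance(NamedTuple):
    name: str
    vcpus: int
    mem_gib: float


# T2 family, ordered smallest to largest (verify against current AWS docs).
T2_INSTANCES = [
    Instance("t2.micro", 1, 1),
    Instance("t2.small", 1, 2),
    Instance("t2.medium", 2, 4),
    Instance("t2.large", 2, 8),
    Instance("t2.xlarge", 4, 16),
    Instance("t2.2xlarge", 8, 32),
]


def smallest_fit(
    min_vcpus: int,
    min_mem_gib: float,
    instances: Sequence[Instance] = T2_INSTANCES,
) -> Optional[Instance]:
    """Return the first (smallest) instance meeting both minimums, or None."""
    for inst in instances:
        if inst.vcpus >= min_vcpus and inst.mem_gib >= min_mem_gib:
            return inst
    return None


if __name__ == "__main__":
    # Hypothetical requirement: a tool that needs 2 vCPUs and 8 GiB of RAM.
    print(smallest_fit(2, 8))
```

If the function returns `None`, nothing in the table is big enough and you would need to look at a larger instance family.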


I'm guessing an EC2 Free Tier micro instance is unlikely to meet your needs for genome-scale BAM analysis. As academics, are you perhaps eligible for free credits at AWS or Azure that could get you started?

2.7 years ago

Have you looked into DNAnexus? That's a cloud-based service geared towards bioinformatics; they basically try to act as an intermediary between the basic cloud infrastructure and specific bioinformatics needs. They provide numerous pre-installed tools, for example, and also have a couple of pipelines, which may come in handy.

2.7 years ago
btsui ▴ 290

I have tried many different computing solutions. AWS is probably the simplest solution for bioinformatics tasks, for many reasons. I won't say it is the cheapest, though. You probably need at least a t2.medium (4G RAM) to run most bioinformatics tools. If there are only 80 BAMs, and assuming only 20G per BAM, that is 1,600G (about 1.6 TB) of storage, costing only hundreds of dollars per month.
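The storage arithmetic can be checked with a short sketch. The 80 files and 20G per file come from the estimate in this answer. The price per GB-month is a hypothetical placeholder, not a quoted AWS rate, so substitute the current figure from the AWS pricing page for whichever storage type you use.

```python
def total_storage_gb(n_files: int, gb_per_file: float) -> float:
    """Total storage needed for n_files files of gb_per_file GB each."""
    return n_files * gb_per_file


def monthly_storage_cost(total_gb: float, usd_per_gb_month: float) -> float:
    """Monthly storage cost at a flat per-GB-month price."""
    return total_gb * usd_per_gb_month


if __name__ == "__main__":
    total = total_storage_gb(80, 20)
    # ASSUMPTION: placeholder price, not an actual AWS rate.
    assumed_usd_per_gb_month = 0.10
    cost = monthly_storage_cost(total, assumed_usd_per_gb_month)
    print(f"{total:.0f} GB total, about ${cost:.2f}/month at the assumed rate")
```

This covers storage only. Compute time and data transfer are billed separately, so the real monthly bill depends on how long the instances run.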

I have written a blog post about this kind of thing and am linking it here; hope it helps: https://brianyiktaktsui.wordpress.com/2018/08/11/buying-computing-infrastructure-vs-adopting-the-cloud/
