You didn't actually ask about GATK or variant calling, but I see that in a lot of answers (and that is part of what you would probably want to do on the cloud).
I haven't run BWA-MEM on the cloud myself, but here are some thoughts:
1) I think the best solution depends upon the total number of samples. Right now, I believe you can get a $300 credit to test Google Cloud. So, if you only plan to process a limited number of samples, that can essentially be free.
2) For some people, I think precisionFDA may be an acceptable solution (I believe they already have apps for BWA-MEM alignment and GATK variant calling). If you are willing to contribute data (I created an account with a Gmail address), analysis through DNAnexus will be free.
I have some notes about my experience with precisionFDA in this blog post.
3) If you don't use precisionFDA (and instead use AWS or Google Cloud, post-credit), my limited testing gives me the impression that the costs may be a little higher than you may expect. For example, I spent a few hundred dollars getting used to AWS (although that may still have been less than a formal class), and I have some notes about running DeepVariant on Google Cloud (and AWS) here.
However, the long-term costs are admittedly something other users may be better able to answer than I am.
I hope this helps!
Is it intentional that this question looks like a homework assignment?
Remember to show your work for partial credit
Is this supposed to be a sarcastic answer?
"Broad has reduced the cost of processing on the cloud from about $45 per genome when the cloud move started to $5 now, and has a target of $3 based on some work currently in progress, Mr. Mayo said." https://blogs.wsj.com/cio/2018/03/12/harvard-mits-broad-institute-powers-genomic-research-in-the-cloud/
Is that pricing available for everyone or only to those who are at the level that Broad does business at?
Yes, I suspect there are some efficiencies of scale and also negotiated rates.
Nope, that’s the cost for anyone who uses our pipeline on Google, for a single whole genome, going from unmapped reads to GVCF or VCF, including QC. It has nothing to do with scale or preferential pricing, except that we benefited from GCP engineers’ help to optimize the pipeline. Check it out here
that's amazing! good work
refers to this post: https://medium.com/truwl/what-is-the-cost-of-bioinformatics-a-look-at-bioinformatics-pricing-and-costs-1e4c1c3bcb4f