Running BLAST on Amazon's Elastic MapReduce
10.9 years ago
Anjan ▴ 830

Hello, I need to extract 10 million (yes, 10 million) sequences from the BLAST nt database using the blastdbcmd command. Needless to say, this will take days on my rinky-dink desktop (in an earlier run I could extract ~2 million sequences in two and a half days). Does anyone have experience running BLAST on Amazon's cloud? Is the BLAST suite of programs pre-installed on the Amazon cores? (I presume not.) This cloud rookie would really appreciate some pointers. Thanks, Anjan

blast cloud
10.9 years ago
Mndoci ★ 1.2k

There is a BLAST AMI out there, provided by NCBI. They update it regularly. You can grab the latest AMI ID from here

Edit: Didn't read the specific title of the post. People do run BLAST with Elastic MapReduce, but I am not aware of any canned version off the top of my head. Will update if anything comes to mind.

10.9 years ago

Installing BLAST on a Linux system is fairly straightforward, so that shouldn't be a problem.

IMHO you may not need to concern yourself with MapReduce either: just rent two or three systems with lots of CPUs and run the scripts you already have.
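
For what it's worth, here is a rough sketch of how the extraction could be spread over the CPUs of one such machine: split the ID list into chunks and run one blastdbcmd per chunk. The file names (`ids.txt`, the `chunks` directory), the worker count, and the helper function names are all made up for illustration. It assumes blastdbcmd is on the PATH and can find the nt database, and it only uses the `-db`, `-entry_batch` and `-out` options. Whether this actually speeds things up depends a lot on disk I/O, so treat it as a starting point rather than a tested recipe.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_ids(ids, n_chunks):
    """Split a list of sequence IDs into at most n_chunks roughly equal parts."""
    size = max(1, -(-len(ids) // n_chunks))  # ceiling division, never zero
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def build_command(db, batch_file, out_file):
    """Build the blastdbcmd call for one chunk of IDs."""
    return ["blastdbcmd", "-db", db,
            "-entry_batch", str(batch_file),
            "-out", str(out_file)]


def extract_chunk(db, chunk, index, workdir):
    """Write one chunk of IDs to a file and extract its sequences."""
    batch_file = workdir / f"ids_{index}.txt"      # hypothetical file naming
    out_file = workdir / f"seqs_{index}.fasta"
    batch_file.write_text("\n".join(chunk) + "\n")
    subprocess.run(build_command(db, batch_file, out_file), check=True)
    return out_file


def extract_all(db, id_file, workdir, n_workers=8):
    """Run one blastdbcmd process per chunk, n_workers at a time."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    lines = Path(id_file).read_text().splitlines()
    ids = [line.strip() for line in lines if line.strip()]
    chunks = split_ids(ids, n_workers)
    # Threads are enough here: the real work happens in the child processes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(extract_chunk, db, chunk, i, workdir)
                   for i, chunk in enumerate(chunks)]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Assumed inputs: one sequence ID per line in ids.txt, database named "nt".
    extract_all("nt", "ids.txt", "chunks", n_workers=8)
```

The same ID chunks could just as well be copied to two or three rented machines, with each machine running its own share, and the resulting FASTA files concatenated afterwards.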


