Question: Step-by-step BLASTX against nr using Amazon cloud
Farbod (Toronto) wrote, 3.2 years ago:

Dear Biostars, Hi

I want to run blastx against NCBI nr for annotating thousands of de novo transcripts. I want to do it as fast as possible.

I have heard that one way is to use the Amazon cloud. Could you kindly point me to a step-by-step manual or resource for this?

(For example: Where do I begin? How many resources do I need to rent from the cloud, and what are the related costs? Should I download the nr database locally or download it directly into the Amazon cloud? Should I upload my transcriptome assembly to Amazon? Is there any pre-installed, ready-to-use cloud for bioinformatics (some "BioCloud")? And so on.)

Thanks

Notes:

0- I have checked some Biostars posts but could not find any simple guidance.

1- I have no computational resources at home (no server and ...)

2- I do not want to use Blast2GO Pro CloudBlast (it is too expensive for me).

Tags: blast, cloud, annotation

DIAMOND is mainly meant for short reads from metagenomic data. If you have longer sequences (say, 1 kb long), I don't think it is the right choice.

(comment by Abhishek, 21 months ago)
GenoMax (United States) wrote, 3.2 years ago:

Have you looked at the NCBI BLAST on Cloud help? There is step-by-step guidance available using links at the left.

That said, if you were looking to do this for thousands of transcripts then you should use DIAMOND instead. Regular blast+ may take too long and cost more.

There is also PAUDA.
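
For readers who want to try the DIAMOND route, here is a minimal sketch of how the two commands involved (indexing nr, then a blastx-style search) could be assembled from Python. The file names (`nr.faa`, `transcripts.fasta`, `hits.tsv`), the thread count, and the e-value cutoff are placeholders chosen for illustration, not values from this thread; check the options against the DIAMOND version you install.

```python
import shlex


def diamond_makedb_cmd(nr_fasta, db_prefix, threads=8):
    """Build the `diamond makedb` command that indexes a protein FASTA file."""
    return [
        "diamond", "makedb",
        "--in", nr_fasta,        # protein FASTA, e.g. a downloaded copy of nr
        "--db", db_prefix,       # prefix of the DIAMOND database to create
        "--threads", str(threads),
    ]


def diamond_blastx_cmd(query_fasta, db_prefix, out_tsv,
                       threads=8, evalue=1e-5, max_hits=1, sensitive=False):
    """Build a `diamond blastx` command (translated DNA queries vs. proteins)."""
    cmd = [
        "diamond", "blastx",
        "--query", query_fasta,
        "--db", db_prefix,
        "--out", out_tsv,
        "--outfmt", "6",                     # BLAST-style tabular output
        "--evalue", str(evalue),
        "--max-target-seqs", str(max_hits),  # hits reported per query
        "--threads", str(threads),
    ]
    if sensitive:
        cmd.append("--sensitive")            # slower, finds more distant hits
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace with your own paths.
    print(shlex.join(diamond_makedb_cmd("nr.faa", "nr")))
    print(shlex.join(diamond_blastx_cmd("transcripts.fasta", "nr", "hits.tsv")))
```

The script only prints the commands. To actually run one, pass the list to `subprocess.run(cmd, check=True)` on a machine where DIAMOND is installed and the database has been built.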


Thanks. Have you tried it? Is it efficient in terms of speed and cost? I know that running DIAMOND locally needs more than 32 GB of RAM. If it can be run on the cloud, it is a fast algorithm.

(reply by Farbod, 3.2 years ago)

Be sure to test a small subset of data first to see what the costs would be. They will likely scale in direct proportion to the size of your query data, and they can add up.

(reply by GenoMax, 3.2 years ago)
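
The pilot-run advice above can be sketched roughly as follows: cut a small subset out of the query FASTA, time the search on it, and extrapolate linearly to the full data set. The linear scaling is the assumption stated in the reply, not a measured property, and the numbers in the example (pilot size, run time, hourly price) are made up for illustration.

```python
import io
from itertools import islice


def read_fasta(handle):
    """Yield (header, sequence) tuples from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def take_subset(records, n):
    """Return the first n records as a list (a simple, non-random pilot set)."""
    return list(islice(records, n))


def total_bases(records):
    """Sum the sequence lengths of (header, sequence) records."""
    return sum(len(seq) for _, seq in records)


def extrapolate_cost(pilot_bases, pilot_hours, full_bases, hourly_rate_usd):
    """Assume run time grows in proportion to query bases; return (hours, USD)."""
    hours = pilot_hours * full_bases / pilot_bases
    return hours, hours * hourly_rate_usd


if __name__ == "__main__":
    demo = io.StringIO(">t1\nACGTAC\nGT\n>t2\nGGCC\n>t3\nAT\n")
    pilot = take_subset(read_fasta(demo), 2)
    print(pilot, total_bases(pilot))
    # Hypothetical figures: 1 Mb pilot took 0.5 h, full set is 40 Mb, $2.00/h.
    print(extrapolate_cost(1_000_000, 0.5, 40_000_000, 2.0))
```

Taking the first records is the simplest choice; if the assembly is sorted by length or expression, a random sample would give a more representative pilot.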

Should be available here.

(reply by GenoMax, 3.2 years ago)

I agree, DIAMOND is much faster than BLASTX, especially for large datasets.

(reply by Sej Modha, 3.2 years ago)