Submitting several BLAST queries using NCBIWWW at once
8.0 years ago
pawlowac ▴ 80

Hi everyone,

I am running blastp through NCBIWWW in Biopython and need to BLAST 50-100 sequences at a time. Right now, I am just going through the list one by one. I would like to submit several of these at once.

Is there a way to do this?

ncbiwww blast biopython • 2.3k views
8.0 years ago
skbrimer ▴ 710

Hi pawlowac,

I had a very similar question a couple of months ago (Using Biopython and BLAST+ to automate de novo viral contig sorting), and what Peter says there is true. The short answer is that you do not need Biopython: you can use standalone BLAST+ and pass the file containing your sequences as the query. It works with any number of sequences; it will just take some time.


Thanks for the answer. I had first used BLAST+ to do this, but kept getting timeout errors. I tried again after your suggestion (with the exact same command) and it works great now. It must be the NCBI connection being unreliable, as always.


Great, I'm glad it worked for you. :)


OK, I take it back. It worked once, but now it says CPU limit exceeded. There were 150 proteins I was trying to BLAST...


You can limit the number of results in the search parameters with the 'max_target_seqs' flag. I think the manual has the default set to 500 or something similarly high. If you only need a few close hits, you can run it with a fixed number. I was only concerned with the single best match, so I run it with

-max_target_seqs 1
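
For anyone who wants to drive this from a script, here is a minimal sketch that puts the two suggestions together: one multi-FASTA file as the query, plus an optional cap on hits per query. It assumes the search is run against NCBI with the `-remote` option and written as XML (`-outfmt 5`); the helper names `build_blastp_command` and `run_blast`, the file names, and the `nr` database are placeholders of mine, not the exact command used above.

```python
import subprocess


def build_blastp_command(query_fasta, out_xml, max_target_seqs=None, db="nr"):
    """Build the argument list for a blastp run on a multi-FASTA query file.

    Hypothetical helper: file names and the default database are assumptions.
    """
    cmd = [
        "blastp",
        "-query", query_fasta,  # one file holding all query sequences
        "-db", db,
        "-remote",              # assumption: search runs on NCBI's servers
        "-outfmt", "5",         # XML output
        "-out", out_xml,
    ]
    if max_target_seqs is not None:
        # Cap the number of hits kept per query sequence.
        cmd += ["-max_target_seqs", str(max_target_seqs)]
    return cmd


def run_blast(cmd):
    """Run the command; raises CalledProcessError if blastp exits non-zero."""
    subprocess.run(cmd, check=True)
```

Calling `run_blast(build_blastp_command("queries.fasta", "results.xml", max_target_seqs=1))` would then launch the search, provided BLAST+ is installed and on the PATH.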

Unfortunately, I need the diversity, and there is significant overlap in results between the sequences. I end up parsing the XML results using Biopython, grabbing the sequence IDs that meet certain conditions, and then checking for duplicates before using efetch to grab the FASTA files. Oh well, back to the drawing board.
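
A rough sketch of that filter-and-deduplicate step, in case it helps someone with the same workflow. It reads the BLAST XML with the standard library's `xml.etree.ElementTree` instead of Biopython's parser, purely so it runs without extra dependencies. The function name `unique_hit_accessions` and the e-value cutoff are illustrative assumptions, since the actual conditions are not stated here.

```python
import xml.etree.ElementTree as ET


def unique_hit_accessions(xml_text, max_evalue=1e-5):
    """Return hit accessions across all queries, each listed once.

    Hypothetical helper: the e-value cutoff stands in for whatever
    conditions the real filter applies.
    """
    root = ET.fromstring(xml_text)
    seen = set()
    ordered = []
    for hit in root.iter("Hit"):
        accession = hit.findtext("Hit_accession")
        # Use the best (lowest) e-value among this hit's HSPs.
        evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
        if not accession or not evalues or min(evalues) > max_evalue:
            continue
        if accession not in seen:  # skip hits already found by another query
            seen.add(accession)
            ordered.append(accession)
    return ordered
```

The returned list keeps first-seen order, so the IDs can be passed on in one batch to efetch instead of fetching the same record several times.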