Question: Why does NCBI BLAST Standalone return fewer hits than Biopython qBLAST?
philipp10 wrote:

Hi there,

I set up NCBI BLAST standalone on my computer and downloaded nt.[01-39].tar.gz as my subject database. When I query an example sequence with standalone BLAST, I receive fewer hits than online or when I use Biopython qblast (result_handle = NCBIWWW.qblast("blastn", "nt", record.seq, expect=10, hitlist_size=100000)).

When I use qblast or the online tool I get 2564 hits, compared to 506 hits when I use standalone BLAST (blastn -query mysequence.txt -db nt -out mysequence_vs_NT.txt -outfmt 17).

Interestingly, I get 1623 hits if I search the parts of the database one by one (blastn -query mysequence.txt -db nt.[0-39] -out mysequence_vs_NT[0-39].txt -outfmt 17). So it doesn't match the online hits, but it is "better" than the standalone result.

How can I get all hits with the standalone BLAST? I appreciate all your help and input!

Thanks, Philipp

modified 3.3 years ago • written 3.3 years ago by philipp10

It is not the answer to your question, but why do you use -outfmt 17? It should be 7, probably...

modified 3.3 years ago • written 3.3 years ago by natasha.sernova3.7k

compared to 506 hits when I use the standalone BLAST

It seems BLAST is restricted to reporting only 500 hits by default. BLAST has several command-line options that can be tuned to return more hits.

written 3.3 years ago by piet1.7k

Thanks Piet, it works when I set -max_target_seqs to a very high number.

@Natasha, I'm using the outfmt 17 option. I don't know if it's new but it creates a list of matches in the database.

modified 3.3 years ago • written 3.3 years ago by philipp10
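
The fix reported above (raising -max_target_seqs) could be scripted roughly as follows. This is only a sketch: the helper name build_blastn_command is hypothetical, the limit of 100000 is an assumption borrowed from the hitlist_size value in the qblast call (the thread only says "a very high number"), and tabular output (-outfmt 6) is assumed here for illustration instead of the format used in the original command.

```python
import shlex


def build_blastn_command(query, db, out, max_target_seqs=100000, outfmt=6):
    """Build a blastn argument list with an explicit limit on reported targets.

    The helper name and the default values are illustrative assumptions,
    not something prescribed by BLAST+ or by the thread.
    """
    return [
        "blastn",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", str(outfmt),
        # Without this option, standalone BLAST stops at its default hit limit.
        "-max_target_seqs", str(max_target_seqs),
    ]


cmd = build_blastn_command("mysequence.txt", "nt", "mysequence_vs_NT.txt")
print(shlex.join(cmd))

# To actually run the search (needs BLAST+ on PATH and a local nt database):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems and also sidesteps the en-dash issue that can creep in when a command is copied from a formatted page.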

I don't think 17 is a valid option. BLAST is probably just using 1.

0 = pairwise,
1 = query-anchored showing identities,
2 = query-anchored no identities,
3 = flat query-anchored, show identities,
4 = flat query-anchored, no identities,
5 = XML Blast output,
6 = tabular,
7 = tabular with comment lines,
8 = Text ASN.1,
9 = Binary ASN.1,
10 = Comma-separated values,
11 = BLAST archive format (ASN.1)
written 3.3 years ago by genomax75k
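
For comparing hit counts between runs, the tabular formats in the list above (6 and 7) are easy to parse. The following is a minimal sketch that assumes the default tabular column order, where the second column is the subject sequence ID, and that a "hit" means a distinct subject sequence (not an individual alignment). The function name count_hits and the example rows are made up for illustration.

```python
def count_hits(lines):
    """Count distinct subject sequences in BLAST tabular output (outfmt 6 or 7).

    Assumes tab-separated rows in the default column order, so the
    subject ID is the second field. Comment lines (outfmt 7) start with '#'.
    """
    subjects = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and outfmt 7 comment lines
        fields = line.split("\t")
        subjects.add(fields[1])
    return len(subjects)


# Made-up rows: two alignments to subjA and one to subjB.
example = [
    "# made-up example rows",
    "query1\tsubjA\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t180",
    "query1\tsubjA\t95.0\t50\t2\t0\t120\t169\t300\t349\t1e-10\t80",
    "query1\tsubjB\t90.0\t100\t10\t0\t1\t100\t5\t104\t1e-30\t120",
]
print(count_hits(example))  # 2
```

With a real result file, the same function can be fed an open file handle, for example `count_hits(open("mysequence_vs_NT.txt"))`, assuming that file was written in tabular format.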