Blastdbcmd Can't Find Protein Database
2
0
9.4 years ago
bioinfo ▴ 830

Here is the command I used:

blastdbcmd -entry_batch input.gi.txt -out file.out -target_only -outfmt %t -db my_protein_database.fasta


I got the error "Error: Entry not found in BLAST database". I have indexed my database, and the .phr, .pin, .prj etc. files (created by vmatch) are in the same directory as my protein database (my_protein_database.fasta), but blastdbcmd is in the user/bin directory. I am a bit suspicious of the index files, since they were created by vmatch. Is there another way to build the protein database index with a BLAST tool so that it is compatible with blastdbcmd? Any suggestions on where exactly I'm going wrong? I'm using blastdbcmd version 2.2.25+.

blast • 5.8k views
0

It looks as though the entries are not being found, rather than the database itself. Are you sure your input file contains properly formatted GIs? They have to match the format used in the FASTA headers. Could you try retrieving a few single entries with -entry to see whether the database is properly indexed, and post the output of blastdbcmd -info?

0

I just created another index using makeblastdb.

Now, as you suggested:

blastdbcmd -entry 273547854 -out file.out -target_only -outfmt %t -db blast.db.fasta


Error: Entry not found in BLAST database. When I add the -info flag, it conflicts with -entry (Error: (CArgException::eConstraint) Argument "info". Conflict with argument: 'entry').

This is what the database entries look like:

>gi|154000884|gb|ABS57010.1| CadX [Streptococcus salivarius] [cadX]
MKKDSICQVDVINQQNVTTATNYLEKEKVQKSLRILSKFTDNKQINIIFYLLAVEELCVCDIACLLNLSMASASHHLRKL
ANQNILDTRREGKIIYYFIKYEEIKDFFNQLG
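
For anyone checking the same thing, here is a minimal sketch (not part of the original exchange) that compares a list of requested GIs against the GIs in the FASTA headers using only standard Unix tools. The file names demo.fasta and demo.gi.txt are hypothetical stand-ins, and it assumes headers of the form >gi|NUMBER|... as in the entry above.

```shell
# Sketch: find requested GIs that do not occur in any FASTA header.
# demo.fasta and demo.gi.txt are hypothetical stand-ins for the real files.
workdir=$(mktemp -d)

cat > "$workdir/demo.fasta" <<'EOF'
>gi|154000884|gb|ABS57010.1| CadX [Streptococcus salivarius] [cadX]
MKKDSICQVDVINQQNVTTATNYLEKEKVQKSLRILSKFTDNKQINIIFYLLAVEELCVCDIACLLNLSMASASHHLRKL
ANQNILDTRREGKIIYYFIKYEEIKDFFNQLG
EOF

# One GI per line, as passed to -entry_batch.
printf '154000884\n273547854\n' > "$workdir/demo.gi.txt"

# Take the second '|'-separated field of every header starting with ">gi|".
awk -F'|' '/^>gi[|]/ { print $2 }' "$workdir/demo.fasta" | sort -u > "$workdir/fasta.gis"

# comm -23 keeps lines that appear only in the first sorted input,
# i.e. requested GIs with no matching header.
missing=$(sort -u "$workdir/demo.gi.txt" | comm -23 - "$workdir/fasta.gis")
echo "GIs not present in the FASTA headers: $missing"
```

If this reports a GI as missing, no index will make blastdbcmd find it; if nothing is missing, the problem is more likely in how the database was indexed.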

1
9.4 years ago

The way BLAST names items is a little finicky, to say the least.

It depends on the implicit parsing of the headers, and parsing errors are handled silently. I don't have BLAST available on the system I'm typing this on, but something like the following should work:

blastdbcmd -entry all -db yourdatabase -outfmt %g

to check which IDs are actually in the database.
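
A rough sketch of the whole check, under some assumptions: NCBI BLAST+ is on the PATH, the database is rebuilt with makeblastdb using -parse_seqids (a commonly cited requirement for looking entries up by ID, though the thread does not confirm that this was the cause here), and the names demo.fasta and demo_db are made up for illustration.

```shell
# Sketch only: demo.fasta and demo_db are hypothetical names; needs NCBI BLAST+ on PATH.
workdir=$(mktemp -d)

cat > "$workdir/demo.fasta" <<'EOF'
>gi|154000884|gb|ABS57010.1| CadX [Streptococcus salivarius] [cadX]
MKKDSICQVDVINQQNVTTATNYLEKEKVQKSLRILSKFTDNKQINIIFYLLAVEELCVCDIACLLNLSMASASHHLRKL
ANQNILDTRREGKIIYYFIKYEEIKDFFNQLG
EOF

if command -v makeblastdb >/dev/null 2>&1 && command -v blastdbcmd >/dev/null 2>&1; then
    # -parse_seqids asks makeblastdb to index the identifiers found in the headers.
    makeblastdb -in "$workdir/demo.fasta" -dbtype prot -parse_seqids -out "$workdir/demo_db"
    # Print a summary of the database (run without -entry, since the two conflict).
    blastdbcmd -db "$workdir/demo_db" -info
    # One line per sequence: GI, accession, title.
    blastdbcmd -db "$workdir/demo_db" -entry all -outfmt "%g %a %t"
    blast_check="done"
else
    echo "BLAST+ not found on PATH; skipping the database check" >&2
    blast_check="skipped"
fi
```

Comparing the IDs printed by the last command with the ones in the input list should show quickly whether the format matches what blastdbcmd expects.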

0

Yeah, it's a nice way to get all the GIs from the database. I've just seen the other %(a-z) output options, which are very useful. Thanks.

0
9.4 years ago
bioinfo ▴ 830

It's working now. Check the comments in C: Extracting gene/protein name using gi number