Blastdbcmd Can't Find Protein Database
11.0 years ago
bioinfo ▴ 830

Here is the command I used:

blastdbcmd -entry_batch input.gi.txt -out file.out -target_only -outfmt %t -db my_protein_database.fasta

I got the error "Error: Entry not found in BLAST database". I have indexed my database, and the .phr, .pin, .prj etc. files (created by vmatch) are in the same directory as my protein database (my_protein_database.fasta), but blastdbcmd is in the user/bin directory. I am a bit suspicious of the index files because they were created by vmatch. Is there another way to build the protein database index, using a BLAST tool, that is compatible with blastdbcmd? Any suggestions as to where exactly I'm going wrong? I'm using blastdbcmd version 2.2.25+.


It looks as though the entries, rather than the database itself, are not being found. Are you sure your input file has properly formatted GIs? They have to match the format used in the FASTA headers. Could you please try to retrieve a few single entries with -entry to see whether the database is properly indexed, and post the output of blastdbcmd -info?


I just created another index using makeblastdb.

Now, as you suggested:

blastdbcmd -entry 273547854 -out file.out -target_only -outfmt %t -db blast.db.fasta

Error: Entry not found in BLAST database. When I add the -info flag, it conflicts with -entry (Error: (CArgException::eConstraint) Argument "info". Conflict with argument: `entry').

This is what the database entries look like:

>gi|154000884|gb|ABS57010.1| CadX [Streptococcus salivarius] [cadX]
MKKDSICQVDVINQQNVTTATNYLEKEKVQKSLRILSKFTDNKQINIIFYLLAVEELCVCDIACLLNLSMASASHHLRKL
ANQNILDTRREGKIIYYFIKYEEIKDFFNQLG
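
Before suspecting the index, it can help to confirm that the requested GIs occur in the FASTA headers at all. The following is a minimal sketch that needs no BLAST installation. It assumes headers of the form `>gi|NUMBER|...` as shown above; the file names sample.fasta, sample.gi.txt, fasta.gi.txt and missing.gi.txt are hypothetical, and the sample content is copied from this thread so that the block runs on its own.

```shell
#!/bin/sh
# Hypothetical sample files built from the header and IDs quoted in this thread.
# Replace them with your own FASTA file and ID list.
printf '%s\n' \
    '>gi|154000884|gb|ABS57010.1| CadX [Streptococcus salivarius] [cadX]' \
    'MKKDSICQVDVINQQNVTTATNYLEKEKVQKSLRILSKFTDNKQINIIFYLLAVEELCVCDIACLLNLSMASASHHLRKL' \
    'ANQNILDTRREGKIIYYFIKYEEIKDFFNQLG' > sample.fasta
printf '%s\n' 154000884 273547854 > sample.gi.txt

# Collect the GI (second '|'-separated field) from every ">gi|..." header.
awk -F'|' '/^>gi\|/ { print $2 }' sample.fasta | sort -u > fasta.gi.txt

# Keep only the requested IDs that never appear in a FASTA header.
sort -u sample.gi.txt | comm -23 - fasta.gi.txt > missing.gi.txt

cat missing.gi.txt
```

Any ID printed at the end is absent from the FASTA headers, so blastdbcmd could not return it however the database was indexed.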
11.0 years ago

The way BLAST names items is a little finicky, to say the least.

It depends on the implicit parsing that takes place, and errors are silently ignored. I don't have BLAST available on the system I am typing this on, but try something like

blastdbcmd -entry all -db yourdatabase -outfmt %g

to check which IDs are actually in the database.
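
One common cause of this error is a database built without ID parsing, in which case the GIs in the headers are not indexed and lookups by GI fail. The following is a rough sketch of rebuilding the index with makeblastdb and then inspecting it. The names my_protein_database.fasta and my_protein_db are placeholders, and whether `-parse_seqids` resolves this particular case is an assumption, not something confirmed in this thread. Note that `-info` is run as a separate command because it conflicts with `-entry`.

```shell
#!/bin/sh
# Placeholder names; substitute your own FASTA file and database name.
fasta=my_protein_database.fasta
db=my_protein_db

if command -v makeblastdb >/dev/null 2>&1 && [ -f "$fasta" ]; then
    # Build a protein database and parse the sequence IDs from the headers.
    makeblastdb -in "$fasta" -dbtype prot -parse_seqids -out "$db" &&
    # Print a summary of the database (run separately from -entry).
    blastdbcmd -db "$db" -info &&
    # Show GI, accession and title for the first few entries.
    blastdbcmd -db "$db" -entry all -outfmt '%g %a %t' | head -n 5
    status=ran
else
    echo "makeblastdb not on PATH or $fasta missing; nothing done" >&2
    status=skipped
fi
```

If the GIs listed by the last command match those in the input file, the original `-entry_batch` call can be retried against the new database name.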


Yeah, it's a nice way to get all the GIs from the database. I've just seen the other %(a-z) options, which are very useful. Thanks.

11.0 years ago
bioinfo ▴ 830

It's working now. Check the comments in C: Extracting gene/protein name using gi number
