Let's say I have a multi-FASTA file with protein sequences that have internal (integer) IDs:
>1234
MGKL...*
I build a BLAST database using:
formatdb -i infile.fa -pF -n someDB
But then I'm unable to retrieve a sequence from the database using the plain protein ID:
fastacmd -d someDB -s 1234
How should I define the FASTA header so that I can retrieve sequences easily?
I have noticed that formatdb assigns its own internal identifiers (incrementing integers) to my sequences, and the original ID only appears after them:
>gnl|BL_ORD_ID|12 1234
Why is that?
I then defined the headers as:
>gnl|dbname|1234
but with no effect. Do I have to define the headers as >gi|1234 in order to be able to get a sequence? Or is there any other way of retrieving sequences from a BLAST database?
Oh, stupid me! I didn't notice that parameter. Thanks a lot, Nagarajan.
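For anyone landing here later, a minimal sketch of what this can look like. The parameter is not named in this thread, so this assumes it is formatdb's -o T option, which parses the sequence IDs in the headers and builds an index instead of only assigning BL_ORD_ID ordinals. The two-record input, its placeholder sequences, and the file name infile.gnl.fa are made up for illustration, and the lookup by full gnl|someDB|1234 identifier is how I understand fastacmd to work rather than something confirmed above.

```shell
# Hypothetical two-record input; the sequences are placeholders, not real data.
printf '>1234\nMGKLAAA\n>1235\nMSTVCCC\n' > infile.fa

# Prefix each bare ID with gnl|someDB| so it becomes a general identifier.
sed 's/^>/>gnl|someDB|/' infile.fa > infile.gnl.fa

# Run the legacy BLAST tools only if they are installed.
if command -v formatdb >/dev/null 2>&1; then
    # -p T: protein input; -o T (assumed to be the missing parameter):
    # parse SeqIds and build the ID index.
    formatdb -i infile.gnl.fa -p T -o T -n someDB
    # Look the sequence up by its full general identifier.
    fastacmd -d someDB -s 'gnl|someDB|1234'
fi
```

A bare integer passed to -s may be treated as a GI number, which would explain why 1234 alone finds nothing; passing the full identifier avoids that ambiguity.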