I am trying to create a BLAST database with protein chain sequences. My FASTA looks like this:
>1fig_1
ENVLTQSPAIMSASPGEKVTMACRASSSVSSTYLHWYQQKSGASPKLLIYSTSNLASGVP
ARFSGS
>1p8v_2
EADCGLRPLFEKKSLEDKTERELLESYID
>5ivx_13
MSHSLRYFVTAVSRPGFGEPRYMEVGYVDNTEFVRFDSDAENPRYEPRARWIEQEGPEYW
ERETRRAKGNEQSFRVDL
.. etc.
Here, the number after the underscore is the entity ID from the mmCIF file. (I am not using chain IDs, because that would be more redundant.)
I keep getting the error:
BLAST Database creation error: Multi-letters chain PDB id is not supported in v4 BLAST DB
The problem, I gather, is that makeblastdb seems to expect a maximum of six characters in the ID, so it fails whenever the number after the underscore has two digits.
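To check this hypothesis against a FASTA file, a minimal sketch along these lines could list the headers that fit the suspected pattern. The regular expression and the function name flag_long_suffix_ids are assumptions made for illustration; this only mirrors my guess about the limit and is not how makeblastdb actually parses IDs.

```python
import re

# Assumed pattern: a 4-character PDB code, an underscore, then a suffix.
# This reflects the guess above, not makeblastdb's real parsing rules.
PDB_LIKE = re.compile(r"^([0-9][A-Za-z0-9]{3})_(\S+)$")


def flag_long_suffix_ids(fasta_lines):
    """Return header IDs whose suffix after the underscore has more than one character."""
    flagged = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        parts = line[1:].split()
        if not parts:
            continue  # empty header
        seq_id = parts[0]
        match = PDB_LIKE.match(seq_id)
        if match and len(match.group(2)) > 1:
            flagged.append(seq_id)
    return flagged


# Headers and (truncated) sequences taken from the example above.
example = [
    ">1fig_1",
    "ENVLTQSPAIMSASPGEKVTMACRASSSVSSTYLHWYQQKSGASPKLLIYSTSNLASGVP",
    ">1p8v_2",
    "EADCGLRPLFEKKSLEDKTERELLESYID",
    ">5ivx_13",
    "MSHSLRYFVTAVSRPGFGEPRYMEVGYVDNTEFVRFDSDAENPRYEPRARWIEQEGPEYW",
]
print(flag_long_suffix_ids(example))  # ['5ivx_13']
```

If only the flagged IDs trigger the error, that would support the idea that the length of the suffix is what makeblastdb objects to.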
However, since some PDB entries have more than 9 entities, I need to use more characters. Is there a way to get around this limit? (I am currently using makeblastdb version 2.9.0.)