InParanoid blast-all & formatdb Error
4.6 years ago
Izal ▴ 10

Hi everyone!

I'm trying to find protein orthologs between strains using InParanoid, but when I run it I get errors. Is this normal, or am I doing something wrong? The command seems simple...

Command:

$ perl inparanoid.pl strain1.fa strain2.fa


Here is part of the output, including the errors:

Trying to run BLAST now - this may take several hours ... or days in worst case!
Formatting BLAST databases
Done formatting
Starting BLAST searches...
Starting first BLAST pass for strain1.fa - strain1.fa on mar abr 24 20:53:57 CEST 2018
[blastall] WARNING: the -C 3 argument is currently experimental
Starting second BLAST pass for strain1.fa - strain1.fa on mar abr 24 20:54:38 CEST 2018
[formatdb] WARNING: Cannot add sequence number 1 (lcl|1_./tmpd) because it has zero-length.
[formatdb] FATAL ERROR: Fatal error when adding sequence to BLAST database.
[blastall] FATAL ERROR: -triphosphate: Database ./tmpd was not found or does not exist
no element found at line 1, column 0, byte -1 at ./blast_parser.pl line 110.
[formatdb] WARNING: Cannot add sequence number 2 (lcl|2_./tmpd) because it has zero-length.
[formatdb] FATAL ERROR: Fatal error when adding sequence to BLAST database.
[blastall] FATAL ERROR: NC_101010.3_prot_WP_011212121.1_120: Database ./tmpd was not found or does not exist

.....

You're absolutely right! :-P

Okay, I've managed to fix the problem! In case someone else runs into the same thing, the solution is to simplify the sequence headers in the FASTA files: edit each header so that it no longer contains symbols or sequence descriptions wrapped in tags or square brackets. That's all!
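For anyone who would rather not edit the headers by hand, here is a minimal Python sketch of that clean-up. It is my own illustration, not part of InParanoid: the function names `simplify_header` and `simplify_fasta` are made up for this example, and the rule it applies (drop anything in square brackets, keep only the first word, replace every character other than letters, digits, `_` and `.` with `_`) is an assumption about what counts as a "simple" header. Adjust it if your setup turns out to be stricter.

```python
import re
import sys


def simplify_header(header: str) -> str:
    """Reduce a FASTA header line to '>' plus one short plain identifier."""
    text = header[1:].strip()                  # drop the leading '>'
    text = re.sub(r"\[[^\]]*\]", " ", text)    # remove [key=value] style annotations
    words = text.split()
    if not words:
        raise ValueError(f"header has no identifier: {header!r}")
    # Assumption: letters, digits, '_' and '.' are safe; everything else becomes '_'.
    token = re.sub(r"[^A-Za-z0-9_.]", "_", words[0]).strip("_")
    if not token:
        raise ValueError(f"header has no usable identifier: {header!r}")
    return ">" + token


def simplify_fasta(lines):
    """Yield FASTA lines with simplified headers; sequence lines pass through."""
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            line = simplify_header(line)
            if line in seen:
                # Simplifying must not merge two different sequences into one ID.
                raise ValueError(f"duplicate identifier after simplifying: {line}")
            seen.add(line)
        yield line


def main(in_path: str, out_path: str) -> None:
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in simplify_fasta(src):
            dst.write(line + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Saved as, say, `simplify_headers.py` (a hypothetical file name), it could be run once per input file, writing to a new file so the originals stay untouched, and InParanoid would then be pointed at the cleaned copies. The duplicate check is there because stripping descriptions can, in principle, leave two sequences with the same identifier.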