Hi everyone!
I'm trying to find protein orthologs between strains using InParanoid, but when I run it I get errors. Is this normal, or am I doing something wrong? The command seems simple...
Thanks in advance!!!
Command:
$ perl inparanoid.pl strain1.fa strain2.fa
Here is part of the output, including the errors:
Trying to run BLAST now - this may take several hours ... or days in worst case!
Formatting BLAST databases
Done formatting
Starting BLAST searches...
Starting first BLAST pass for strain1.fa - strain1.fa on mar abr 24 20:53:57 CEST 2018
[blastall] WARNING: the -C 3 argument is currently experimental
Starting second BLAST pass for strain1.fa - strain1.fa on mar abr 24 20:54:38 CEST 2018
[formatdb] WARNING: Cannot add sequence number 1 (lcl|1_./tmpd) because it has zero-length.
[formatdb] FATAL ERROR: Fatal error when adding sequence to BLAST database.
[blastall] FATAL ERROR: -triphosphate: Database ./tmpd was not found or does not exist
no element found at line 1, column 0, byte -1 at ./blast_parser.pl line 110.
[formatdb] WARNING: Cannot add sequence number 2 (lcl|2_./tmpd) because it has zero-length.
[formatdb] FATAL ERROR: Fatal error when adding sequence to BLAST database.
[blastall] FATAL ERROR: NC_101010.3_prot_WP_011212121.1_120: Database ./tmpd was not found or does not exist
.....
It must be pretty dubious software if printing error messages is considered 'normal behaviour' :)
On the more helpful side: can you post the cmdline you're trying to execute?
You're absolutely right! :-P
Okay, I've managed to fix the problem! In case someone else runs into the same thing: the solution is to simplify the sequence headers in the FASTA files, removing any symbols and any sequence descriptions wrapped in tags or square brackets. And that's all!
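For anyone who wants to script this, here is a minimal sketch of one way to do the clean-up in Python. It is only an illustration under some assumptions: the function names are made up for this post, the example header in the comment is hypothetical, and the exact set of characters InParanoid tolerates is not something I have verified. It drops anything in square brackets, keeps only the first word of the header as the ID, and replaces remaining symbols such as `|` with underscores.

```python
import re


def simplify_header(line: str) -> str:
    """Return a simplified FASTA header: '>' plus a cleaned-up sequence ID.

    Hypothetical example input:
    >lcl|NC_101010.3_prot_WP_011212121.1_120 [gene=abc] [protein=x y]
    """
    # Drop the leading '>' and any [key=value] style annotations.
    text = re.sub(r"\[[^\]]*\]", "", line.lstrip(">"))
    # Keep only the first whitespace-delimited token (the sequence ID).
    parts = text.split()
    seq_id = parts[0] if parts else ""
    # Assumption: letters, digits, '_', '.' and '-' are safe; replace the rest.
    seq_id = re.sub(r"[^A-Za-z0-9_.-]", "_", seq_id)
    return ">" + seq_id


def simplify_fasta(lines):
    """Yield FASTA lines with headers simplified and sequence lines unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        yield simplify_header(line) if line.startswith(">") else line
```

To use it, open each input file, pass the file object to `simplify_fasta`, and write the yielded lines to a new file that you then give to inparanoid.pl. Check that the simplified IDs are still unique within each file, since two headers that differed only in their descriptions or symbols could collapse to the same ID.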
lieven.sterck, thank you for your help!