I'm using Unus, a Perl package for phylogenomic analyses. It relies on blast-2.2.25 because it calls the formatdb program, as follows:
    if (($self->{'program'} eq 'blastn' || $self->{'program'} eq 'tblastx' || $self->{'program'} eq 'tblastn') &&
        !(-e $self->{'db'}.".nin" || -e $self->{'db'}.".nal" )){
        system($self->{'formatdb'}, '-i', $self->{'db'}, '-o', 'T', '-p', 'F') == 0
            or LOGDIE "Error running formatdb: $!";
    }elsif(($self->{'program'} eq 'blastp' || $self->{'program'} eq 'blastx') &&
        !(-e $self->{'db'}.".pin" || -e $self->{'db'}.".pal" )){
        system($self->{'formatdb'}, '-i', $self->{'db'}, '-o', 'T', '-p', 'T') == 0
            or LOGDIE "Error running formatdb: $!";
    }
However, formatdb keeps failing with the following messages, which stops Unus:
[formatdb] WARNING: Cannot add sequence number 6 (lcl|XamC:6) because it has zero-length.
[formatdb] WARNING: Cannot add sequence number 1 (lcl|Xam_:1) because it has zero-length.
[formatdb] FATAL ERROR: Fatal error when adding sequence to BLAST database.
[formatdb] WARNING: Cannot add sequence number 41 (lcl|Xamc:41) because it has zero-length.
[formatdb] FATAL ERROR: Fatal error when adding sequence to BLAST database.
[formatdb] WARNING: Cannot add sequence number 7 (lcl|XamF:7) because it has zero-length.
[formatdb] WARNING: Cannot add sequence number 144 (lcl|Xam0:144) because it has zero-length.
I inspected the sequences, and none of them has zero length. Unus is running with 27 Xanthomonas genomes.
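A scan along the following lines is one way to confirm that no FASTA record is empty. It is only a minimal sketch: the helper name find_empty_records and the script name check_fasta.pl are hypothetical, and it checks the files as they are on disk, not whatever Unus passes to formatdb internally.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the headers of FASTA records whose sequence is empty.
# (find_empty_records is a hypothetical helper, not part of Unus.)
sub find_empty_records {
    my ($fh) = @_;
    my (@empty, $header);
    my $length = 0;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;                 # strip newline, CR and trailing blanks
        if ($line =~ /^>(.*)/) {
            # A new header closes the previous record; report it if it was empty.
            push @empty, $header if defined $header && $length == 0;
            ($header, $length) = ($1, 0);
        } elsif (defined $header) {
            $line =~ s/\s+//g;              # ignore whitespace inside the sequence
            $length += length $line;
        }
    }
    # Do not forget the last record in the file.
    push @empty, $header if defined $header && $length == 0;
    return @empty;
}

# Command-line use: perl check_fasta.pl genome1.fasta genome2.fasta ...
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    print "$file\t$_\n" for find_empty_records($fh);
    close $fh;
}
```

Any header it prints belongs to a record with no sequence lines; no output means every record has at least one residue.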
The input sequences were obtained with the extract program in Glimmer3. An example of the input sequences is:
>orf00002 3568 4905 len=1338
GTGATTGTTTTTAAAGGAAATTTAGGGGCCGAAACCCTGTGTTTACCGCCCTGTTTTCTC
ACAAACAAGCTGTGGATAAGCGAAAGCACCTCCACAGGCCCTATTTTTATCCACATGTTA
TCCCCTGCCTGTCCGGTCATTCCTGGCGGCCATGTCTGCACGGTTTCATGCCGATCCCGT
ATCCTTCGAACCGACCGGCATGCCGGATTACAGCCCAGAGCACACCGATCGATGCATGTA
GTGCGGTTGTCCATTCATCGGCTTCGTCGGTTTCAAACCGTCGAGCTTCATCCCTCCAGT
GCCTTGAATCTGCTGACCGGCGACAACGGCGCGGGCAAGACCAGCGTGCTCGAAGCGCTA
CACCTGATGGCTTACGGCCGCAGCTTCCGCGGGCGCGTCCGCGACGGCCTGATCCAACAA
GGCGCCAACGACCTCGAAGTGTTCGTGGAGTGGAAAGAAGGCGGCGGCGCTGCGGTCGAG
CGGACGCGTCGGGCTGGCTTGCGTCATAGCGGGCAGGAATGGACAGGGCGCCTGGACGGG
GAAGACGTGGCGCAGCTTGGCTCTCTTTGCGCTGCGCTGGCAGTGGTGACGTTCGAGCCC
GGCAGCCACGTATTGATCAGTGGCGGTGGTGAACCCCGCCGCCGTTTTCTGGATTGGGGC
CTGTTCCACGTGGAACCCGATTTTCTAACCTTGTGGCGCCGCTATGCGCGAGCCCTCAAA
>orf00004 5020 7464 len=2445
ATGACCGACGAACAAAACACCCCGCCAACACCCAACGGCACCTACGACTCCAGCAAGATC
ACCGTGCTGCGTGGCCTGGAAGCCGTCCGCAAGCGTCCCGGCATGTATATCGGCGACGTC
CATGACGGCACCGGCCTGCATCACATGGTGTTCGAGGTGGTCGACAACTCGGTCGACGAA
GCCCTTGCCGGGCATGCCGACGACATCGTGGTAAAAATCCTGGCCGATGGCTCGGTGGCG
GTCTCCGACAACGGGCGCGGCGTGCCGGTCGACATCCACAAGGAAGAAGGCGTGTCGGCG
GCCGAGGTGATCCTCACCGTGCTCCACGCCGGCGGCAAGTTCGACGACAACAGCTACAAG
GTCTCCGGCGGCCTGCACGGCGTTGGCGTCTCGGTGGTCAACGCGTTGTCAGAGCACCTG
TGGCTGGATATCTGGCGCGACGGCTTCCACTACCAGCAGGAATACGCGCTGGGCGAGCCG
CAGTACCCGCTCAAGCAGCTGGAAGCCTCGACCAAGCGCGGTACCACGCTGCGCTTCAAG
CCGTCCGTGGCCATCTTCAGCGACGTCGAGTTCCATTACGACATCCTGGCGCGGCGCCTG
CGCGAGCTGTCCTTCCTCAATTCTGGCGTCAAGATCACCTTGATCGACGAGCGCGGCGAA
GGCCGTCGCGACGATTTCCATTACGAAGGCGGCATCCGCAGCTTCGTGGAGCATCTGGCG
CAGCTGAAGTCGCCGCTGCACCCGAATGTGATCTCGGTGACCGGCGAGCACAACGGCATC
ATGGTGGACGTGGCCCTGCAATGGACCGACGCCTACCAGGAAACCATGTACTGCTTCACC
What can I do to solve the problem? Or should I change the part of the code where Unus calls formatdb?
Finally, I used Unus with 4 Shigella genomes before, and it didn't have this problem.
Also, I know formatdb is kind of obsolete, so how should I use makeblastdb?
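As far as I understand, the formatdb options used above map onto makeblastdb as follows: -i becomes -in, -p F or -p T becomes -dbtype nucl or -dbtype prot, and -o T becomes -parse_seqids. A minimal sketch of that mapping is below. The helper name makeblastdb_args and the 'makeblastdb' hash key are hypothetical, not part of Unus, and the call is untested inside Unus itself.

```perl
use strict;
use warnings;

# Build the makeblastdb argument list that corresponds to
#   formatdb -i <db> -o T -p <T|F>
# makeblastdb_args is a hypothetical helper, not part of Unus.
sub makeblastdb_args {
    my ($db, $is_protein) = @_;
    return (
        '-in',     $db,
        '-dbtype', ($is_protein ? 'prot' : 'nucl'),
        '-parse_seqids',    # same role as formatdb -o T
    );
}

# Possible use inside the block quoted above (untested assumption;
# $self->{'makeblastdb'} would have to hold the path to the binary):
#   system($self->{'makeblastdb'}, makeblastdb_args($self->{'db'}, 0)) == 0
#       or LOGDIE "Error running makeblastdb: $!";
```

Passing the arguments to system as a list, as the original code does, avoids shell quoting problems with database paths.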