makeblastdb.exe giving me an unclear error message when creating a database
1
1
Entering edit mode
5.3 years ago
DNAngel ▴ 240

I have a large sequence file that I want to convert into a BLAST database so that I can search other sequences against it. I've done this many times before with smaller files, but this one gives me an unclear error message:

Building a new DB, current time: 10/23/2017 14:17:46
New DB name:   ~\blast\db\mydatabase
New DB title:  ~\blast\myseqs.fa
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B

volume: ~\blast\db\mydatabase

file: ~\blast\db\mydatabase.nin
file: ~\blast\db\mydatabase.nhr
file: ~\blast\db\mydatabase.nsq

BLAST Database creation error: Need to write conversion for data type [0].


Note: I do not have missing residues (no empty lines), but my sequences do contain gaps, represented by "-". I thought that might be the problem, but when I take, say, the first 10 sequences from the same file (keeping the gaps), they convert into a database without trouble. Then I thought it might be the file size (about 47,400 KB), so I split the file into 3 smaller files. Only the second of the three converted successfully into a nucleotide database; the other two did not (note: all three were the same size, and nothing was different about the sequences).

Here is the very simple command I used and have always used before with no issue:

makeblastdb.exe -in myseqs.fa -dbtype nucl -out mydatabase


I've contacted the NCBI support group for standalone BLAST, but they have not responded, and I could not find any other instances of that error message on Google. I'm stumped.


You are using a single - to represent gaps of any length, correct?


Each '-' represents one gap position in the sequence, so one hyphen = one base.

5.3 years ago

this message seems to be generated when your DNA is not:

Check your DNA sequences and search for strange characters in the FASTA file, e.g.:

 grep -v '^>' input.fa | grep -o . | sort | uniq -c
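
If the grep/sort pipeline is awkward to run on Windows, the same tally can be sketched in Python. This is only an illustration, not something from the BLAST tools: the function name `count_sequence_chars` is made up here, and the file is read as raw bytes on the assumption that an unexpected encoding should be counted rather than cause a crash.

```python
from collections import Counter


def count_sequence_chars(path):
    """Tally every character found on the non-header lines of a FASTA file."""
    counts = Counter()
    # Read bytes, not text, so odd or non-ASCII bytes are counted instead of
    # raising a decoding error.
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                continue  # skip header lines, like grep -v '^>'
            # Strip the line ending (LF or CRLF) so it is not counted.
            counts.update(line.rstrip(b"\r\n"))
    # Iterating over bytes yields integers; map them back to characters.
    return {chr(byte): n for byte, n in counts.items()}
```

Calling `count_sequence_chars("input.fa")` returns a dictionary of character counts. Anything other than the expected nucleotide letters and "-" would be worth a closer look.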


I tried your suggestion for checking for strange characters, but I keep getting another error. Perhaps this is where the issue is? I don't understand the error, though (I am not great with grep/Linux commands).

It says:

Input record exceeds maximum length. Specify larger maximum.

grep: write error: Illegal seek
grep: write error: Invalid or incomplete multibyte or wide character


What is the output of

file input.fa


It should be something like 'ASCII text'.


It says: input.fa: ASCII text, with very long lines


This sounds like an issue related to sort on Windows. Do you have access to a Unix machine? Otherwise, you could try wrapping the long FASTA lines.
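
For the wrapping step, here is a minimal Python sketch that should also run on Windows. The function name `wrap_fasta`, the output file name in the usage note, and the default width of 60 characters are assumptions chosen for illustration. Whether wrapping resolves the sort or makeblastdb errors is not established in this thread.

```python
def wrap_fasta(in_path, out_path, width=60):
    """Copy a FASTA file, splitting sequence lines longer than `width`."""
    # Work on bytes so the contents pass through unchanged apart from
    # the line breaks.
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        for raw in src:
            line = raw.rstrip(b"\r\n")
            if line.startswith(b">"):
                dst.write(line + b"\n")  # header lines are never wrapped
                continue
            # Emit the sequence in chunks of at most `width` characters.
            # Blank lines produce no chunks and are therefore dropped.
            for start in range(0, len(line), width):
                dst.write(line[start:start + width] + b"\n")
```

For example, `wrap_fasta("input.fa", "input_wrapped.fa")` writes a wrapped copy with LF line endings and leaves the original file untouched.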