Question: Makeblastdb Error
9.0 years ago, Daniel Standage (Davis, California, USA) wrote:

I ran into the following error when trying to build a database using makeblastdb (NCBI BLAST 2.2.23+).

> makeblastdb -in uniprot90.faa -dbtype prot -parse_seqids

Building a new DB, current time: 08/30/2010 12:00:11
New DB name:   uniprot90.faa
New DB title:  uniprot90.faa
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Error: invalid string size parameter in function: basic_string::__getRep(size_t,size_t)
 size: -2 is greater than maximum size: -51

I ran the equivalent command with the older tool (formatdb) and got no errors, so I'm assuming the sequence data itself is not the problem. Does anyone know what might have caused this error?


I just saw that the latest release is now 2.2.24 (ftp://ftp.ncbi.nih.gov/blast/executables/blast+/2.2.24/); can you try with that? If it still fails, a reproducible example is required: could you try with only the first few entries of your input, or put the FASTA file online? Some wilder guesses: try without -parse_seqids, since the cause could be, e.g., non-UTF-8 characters in the FASTA headers. Either way, NCBI will need a reproducible example.

written 9.0 years ago by Michael Dondrup
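
The two checks suggested in the comment above (cutting the input down to its first few entries, and looking for unusual characters in the headers) could be sketched roughly as follows. This is only an illustration: the function names are made up for this example, and the header check flags any non-ASCII byte, which is a broader test than "non-UTF-8".

```python
from typing import Iterable, Iterator, List, Tuple


def first_records(lines: Iterable[str], n: int) -> Iterator[str]:
    """Yield the lines belonging to the first n FASTA records."""
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n:
                return  # stop at the defline of record n + 1
        if seen > 0:
            yield line


def non_ascii_headers(lines: Iterable[bytes]) -> List[Tuple[int, bytes]]:
    """Return (line_number, raw_header) for deflines containing non-ASCII bytes.

    Assumption: any byte above 127 in a defline is worth a closer look.
    """
    flagged = []
    for number, raw in enumerate(lines, start=1):
        if raw.startswith(b">") and any(byte > 127 for byte in raw):
            flagged.append((number, raw.rstrip(b"\r\n")))
    return flagged


# Example usage ("subset.faa" is a placeholder output name):
# with open("uniprot90.faa") as src, open("subset.faa", "w") as dst:
#     dst.writelines(first_records(src, 100))
# with open("uniprot90.faa", "rb") as src:
#     print(non_ascii_headers(src))
```

If the small subset still triggers the error, it doubles as the reproducible example to post or send to NCBI.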

This must be a bug with 2.2.23, because there were no issues with 2.2.24. Thanks!

written 9.0 years ago by Daniel Standage
9.0 years ago, Daniel Standage (Davis, California, USA) wrote:

Thanks for the comments. We were able to figure out how to sidestep the issue with BLAST 2.2.23+. The sequence IDs followed this format:

>sp|Q197F5|005L_IIV3

Playing around with this format didn't help until we added a comment to the end of the ID, like so.

>sp|Q197F5|005L_IIV3 my sequence

I don't know how useful this finding is, though, since this problem seems to have been fixed with version 2.2.24+.

Edit: NCBI referred us here. This file provides the defline formats that will produce consistent results.
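
For anyone stuck on 2.2.23+, the workaround described above (adding some text after the ID on each defline) could be applied to a whole file with something like the sketch below. The function name and the default placeholder text are assumptions made for this example; deflines that already have a description are left untouched.

```python
from typing import Iterable, Iterator


def add_description(lines: Iterable[str], placeholder: str = "my sequence") -> Iterator[str]:
    """Append a placeholder description to FASTA deflines that consist of an ID only."""
    for line in lines:
        header = line.rstrip()
        # A defline with no whitespace-separated second field has no description.
        if header.startswith(">") and len(header.split(None, 1)) < 2:
            yield f"{header} {placeholder}\n"
        else:
            yield line


# Example usage ("uniprot90.fixed.faa" is a placeholder output name):
# with open("uniprot90.faa") as src, open("uniprot90.fixed.faa", "w") as dst:
#     dst.writelines(add_description(src))
```

Sequence lines pass through unchanged, so the output differs from the input only on the rewritten deflines.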

9.0 years ago, brentp (Salt Lake City, UT) wrote:

I'm guessing you're on a 32-bit system and your file is larger than the maximum file size shown in your paste, so a size_t overflows. Try using a smaller database, or a 64-bit system.


I thought that might be the problem as well. However, I am on a 64-bit system, and I tried both making the database smaller and allowing a bigger database; neither helped.

written 9.0 years ago by Daniel Standage
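
For readers who want to rule out the same hypothesis on their own machine, a quick check might look like the sketch below. It compares the input file's size with the "Maximum file size" value printed in the log from the question and reports the pointer width of the running Python build, which is only a rough proxy for the platform. The function names are invented for this example.

```python
import os
import struct

# Value copied from the "Maximum file size" line in the makeblastdb output.
MAX_FILE_SIZE = 1073741824  # bytes


def exceeds_limit(path: str, limit: int = MAX_FILE_SIZE) -> bool:
    """Return True if the file at path is larger than limit bytes."""
    return os.path.getsize(path) > limit


def pointer_bits() -> int:
    """Pointer width (in bits) of the running Python interpreter."""
    return struct.calcsize("P") * 8


# Example usage:
# print(exceeds_limit("uniprot90.faa"), pointer_bits())
```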