Question: Blastn - Program Runs Indefinitely When Generating Xml Formatted Output
User 6228 wrote (8.2 years ago):

I am running blastn on some nucleotide data, and it seems to run indefinitely when I generate XML output. The jobs take ~15 minutes when generating either the default or the tab-delimited format, but when I choose XML each job hits the 3-hour cap I have set on it. I find it hard to believe that XML generation alone would make a job at least twelve times longer, so I figure there is a problem somewhere. Has anyone run into this?

I am using BLAST 2.2.24+.

Thanks

EDIT: Here are some example commands:

Working (archive format):
/uaopt/ncbi/2.2.24+/bin/blastn -outfmt 11 -db beij \
-query $home'datasets/main/KCmeta.fna' \
-out '/scr3/bmf/results/reference
alignment/blast/beij_archive.asn'

Not working (xml format):
/uaopt/ncbi/2.2.24+/bin/blastn -outfmt 5 -db beij \
-query $home'datasets/main/KCmeta.fna' \
-out '/scr3/bmf/results/reference
alignment/blast/beij.xml'

Also working are the default format, tab delimited, tab delimited w/ comments, & CSV.

I realized I have access to 2.2.24+, but the results are the same. I'd prefer not to need 2.2.25, since this is a high-performance computing lab where I have to request that it be installed.


Please post one exact command that works and one that does not.

(comment by Michael Schubert, 8.2 years ago)

Just a thought: you're not running out of disk space?

(comment by Neilfws, 8.2 years ago)

Using the XML format usually generates a lot of data. How many sequences are you running, and against which database? Like Michael asked, please show us the parameters you've used.

(comment by Fucitol, 8.2 years ago)

Please try again with the latest version 2.2.25.

(comment by Michael Dondrup, 8.2 years ago)
Hamish (UK) wrote (7.7 years ago):

Well, I can't replicate the problem, so this is going to be a bit of a stab in the dark...

When I've had problems generating NCBI BLAST XML output in the past, the cause has been an issue with the database. Some things for you to check:

  1. That the sequence identifiers are unique in the database.
  2. The BLAST database was created with the identifiers indexed, i.e. for FASTA-format input use formatdb with '-oT' or makeblastdb with '-parse_seqids'. Note: BLAST indexes identifiers case-insensitively, so in step 1 be careful to catch identifiers that differ only in case.
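
The two checks above can be sketched in shell. This is only a rough illustration: the function name find_duplicate_ids and the file name db.fasta are hypothetical, and it assumes the identifier is the first whitespace-delimited token after '>' on each header line.

```shell
#!/bin/sh
# List sequence identifiers that occur more than once in a FASTA file when
# compared case-insensitively. Assumption: the identifier is the first
# whitespace-delimited token after '>' on each header line.
find_duplicate_ids() {
    # $1: path to the FASTA file the BLAST database was built from
    awk '/^>/ { id = tolower(substr($1, 2)); if (id != "") print id }' "$1" \
        | sort | uniq -d
}

# Hypothetical usage (db.fasta stands in for the real input file):
#   find_duplicate_ids db.fasta
# If nothing is printed, the database could be rebuilt with indexed identifiers:
#   makeblastdb -in db.fasta -dbtype nucl -parse_seqids -out beij
```

Any identifier this prints (in lower case) appears at least twice in the input once case is ignored, so those records would need renaming before the database is rebuilt.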

Beyond that, it sounds like you might have found a bug, so contacting the NCBI BLAST help-desk may be the only way forward (see BLAST help). If you do find an explanation for this behavior, be sure to post an answer so we all know what to look for in future.
