Something really bizarre is going on. I ran 6 tblastn jobs independently using the following command line entry:
tblastn -query blastqueries_14920.fasta -out blastqueries_14920.fasta.out -db refseq_genomic -outfmt 5 -soft_masking true -show_gis -seg yes &
After repeating this 6 times for all my input files, a quick look at the directory showed the output files had been created as expected, but all with a size of 0 KB.
In the `top` list, I saw all 6 of my tblastn jobs running happily at 100% CPU usage each.
After about a day, 2 of them are no longer in the list of active processes, and yet all 6 .out files are still empty.
No error messages in case you were wondering.
...so any ideas on where the heck the results are? if there are any? aah!
UPDATE
I am running tblastn 2.2.25 against a local copy of the BLAST db refseq_genomic. The computer is a Mac Pro with 2 quad-core Intel Xeon processors and 16 GB of RAM.
The BLAST+ that I downloaded was the disk image version; I don't know if that means it is 32-bit only??
Okay, so I now know more about what's going on!
The rest of my runs also died, and this time I was around to catch the full error:
tblastn(32920) malloc: *** mmap(size=1496113152) failed (error code=12)
set a breakpoint in malloc_error_break to debug
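For what it's worth, error code 12 here is the OS errno ENOMEM, i.e. the kernel refused the ~1.4 GB mmap request. A quick one-liner to confirm the mapping (a sketch, assuming python3 is on the PATH):

```shell
# errno 12 -> ENOMEM: the allocation request could not be satisfied.
python3 -c 'import errno, os; print(errno.errorcode[12], "=", os.strerror(12))'
# prints: ENOMEM = Cannot allocate memory
```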
Someone else has posted the same problem - Problem Running Blast Jobs.
check with a baby fasta file and see if it generates the outputs...
your disk might be full! happened to me...
A bus error is a memory error; I've seen that happen when running an executable on a platform it was not designed for, e.g. a 64-bit application on a 32-bit platform.
Yeah it does... I did that a million times before running my big job. I should say, each of my runs is querying over 8000 sequences...
Nope - the disk is not full, but a few minutes ago the terminal spat out a bus error. Not sure if that means that blast died.
Thanks Istvan, but that's not it either..
+1 to all of you for giving it a go though!
So you've split your massive job into six sections that take a day (or more) to run. My suggestion is to split it into many more sections, each of which is smaller (on the same scale as what you tested). BLAST parallelizes beautifully on the level of input sequences, so there's no real reason to submit a massive job all at once. A simple shell script can immediately queue up the next job when one job finishes...
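A minimal sketch of that split-and-queue idea: cut the query file into fixed-size chunks with awk, then loop over them so each job starts as soon as the previous one finishes. The toy input, chunk size, and filenames are illustrative; the tblastn loop is commented out since it needs the local refseq_genomic db.

```shell
# Toy stand-in for the real query file (6 short protein sequences).
printf '>seq%d\nMKTAYIAK\n' 1 2 3 4 5 6 > queries.fasta

# Split into chunks of 2 sequences each: chunk_000.fasta, chunk_001.fasta, ...
awk -v size=2 '/^>/ { if (n % size == 0) { if (f) close(f); f = sprintf("chunk_%03d.fasta", n/size) } n++ } { print > f }' queries.fasta

# Run the chunks back-to-back; the next job is queued the moment one finishes:
# for chunk in chunk_*.fasta; do
#     tblastn -query "$chunk" -out "$chunk.out" -db refseq_genomic -outfmt 5
# done
ls chunk_*.fasta
```

This also limits the damage when one chunk dies: you lose one small chunk of work, not a day-long run.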
But 8000 query seqs is not that much. I recently ran over 80,000 seqs in one file against NR using blastx; it took a few days (only one segfault along the way...) but worked after resuming from there.