Blast run fails: memory exceeds the limit
18 months ago
langziv ▴ 20

Hi.

While tracking the job's progress, I saw that resources_used.mem keeps growing until it reaches the memory limit defined in the script, and then the run fails. I tried increasing the memory, but the run still fails eventually because resources_used.mem keeps growing. I also noticed that the output file created by the program remains at 0 KB. Usually output files grow throughout a program's run, but in this job it stays at 0 KB. Does anybody know what's wrong with the command's parameters?

The script:

#!/bin/bash
#PBS -q ...
#PBS -N ...
#PBS -e ...
#PBS -o ...
#PBS -l nodes=1:ppn=20,mem=60gb

cd /some/path/

blastx -query A1/scaffold.fa \
-db /root/BLAST/Proteins2/nr \
-max_hsps 1 -max_target_seqs 1 -num_threads 20 \
-out just_trying.txt \
-outfmt "6 std staxids qseqid sseqid staxids sscinames scomnames stitle"


The error message:

Warning: [blastx] Examining 5 or more matches is recommended
=>> PBS: job killed: mem 63195476kb exceeded limit 60000000kb

bash blast blastx • 818 views
18 months ago
Mensur Dlakic ★ 18k

This error message is as clear as it gets: 60 GB is not enough to run BLAST against the non-redundant (nr) database. It may work, with lots of swapping, on a computer that has no more than 60 GB of RAM, but on a cluster with more memory BLAST will keep going and eventually blow through the limit you set. Two solutions I can think of: 1) assign more than 60 GB of memory; I would go for the maximum your cluster has available; 2) use a database clustered at 90% sequence identity, as that will lower its size to only 30-40% of nr without affecting the search in any significant way, unless you absolutely must have all sequence matches. I use UniRef90 for this purpose.
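To get a rough feel for how much memory to request, one option is to add up the on-disk size of the database files and compare it with the `mem=` value in the job script. The sketch below assumes that memory use grows roughly with the size of the files sharing the database prefix, which is only an approximation. Both `db_size_bytes` and `bytes_to_gb` are hypothetical helper names, not part of BLAST or PBS.

```shell
#!/bin/bash
# Hypothetical helper (not part of BLAST or PBS): total size in bytes of all
# files that share a BLAST database prefix, e.g. nr.00.psq, nr.00.pin, nr.pal.
db_size_bytes() {
    local prefix="$1"
    # wc -c prints "<bytes> <file>" per file, plus a "total" line when given
    # several files; skip that line and add up the per-file counts ourselves.
    wc -c "${prefix}".* 2>/dev/null |
        awk '$2 != "total" { sum += $1 } END { print sum + 0 }'
}

# Hypothetical helper: convert bytes to whole decimal gigabytes, rounded up.
bytes_to_gb() {
    echo $(( ($1 + 999999999) / 1000000000 ))
}
```

With the paths from the question, `bytes_to_gb "$(db_size_bytes /root/BLAST/Proteins2/nr)"` would print the database size in GB. If that number is well above 60, the `mem=60gb` request is unlikely to be enough, and a smaller database such as UniRef90 becomes the more practical route.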


Thanks Mensur Dlakic. Does that mean that, unlike any tool I have used so far, blast writes the output file only at the end of the run? Is that your experience? Until now I never had an issue with a memory limit, so I thought there might be something wrong with the command I wrote.


Not sure why you are so interested in the issue of file writing when you have a different kind of problem to solve, but here goes the obscure explanation. BLAST starts writing to the output (screen or file, whatever the case may be) immediately, but the initial output is only three lines until the database is read in and the first search results are available. Those three lines are:

BLASTP 2.10.1+
<empty line>
<empty line>


Since output to a file is buffered and written in chunks (typically a few kilobytes each), and the output above is much smaller than one chunk, nothing will be written to disk until the accumulated output exceeds the buffer size. In your case it most likely never gets to that point, because the program blows through the memory limit before it gets to write additional lines.

A less nerdy and shorter explanation is that BLAST writes its output as it goes, but the memory fault kicks in before the program can write anything.


Setting a higher memory limit kept the job running. Now it seems that the job will keep running forever. The output file that was created at the beginning of the run is still empty.