Question

Error in malloc (SPAdes assembler)

3.3 years ago

asgiraldoc • 0

Hi all,

I have 3 paired-end Illumina libraries (151 bp reads), each with almost 15M reads. When I run SPAdes, it runs out of RAM as soon as the assembly starts with k-mer size 55. The error looks like this: jemalloc: Error in malloc(): out of memory Requested: 8388608. As a result, the genome assembly could not be completed. It's a small genome (25 Mb).

I'm working on a server with 1.5 TB of RAM, and this is my command:

spades.py --careful -k 55,77,99 -t 32 -m 1000 --pe1-1 work/mapeo2_F.fastq --pe1-2 work/mapeo2_R.fastq --pe1-1 work/mapeo1_F.fastq --pe1-2 work/mapeo1_R.fastq --pe1-1 work/mapeo3_F.fastq --pe1-2 work/mapeo3_R.fastq -o work/Hcol

Do you have any idea of how to overcome that error?

Assembly genome sequencing software error • 889 views

ADD COMMENT • link updated 3.3 years ago by shelkmike ★ 1.2k • written 3.3 years ago by asgiraldoc • 0

You may have too much data for a small genome. Consider normalizing your sequence reads with bbnorm.sh before trying this assembly.
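To put a number on "too much data", here is a rough coverage estimate using the figures from the question. It assumes "15M reads" means 15 million individual reads per library; if it means 15 million pairs, the result doubles. The function name is made up for illustration.

```shell
# Rough coverage = total reads * read length / genome size (integer division).
estimate_coverage() {
    reads=$1
    read_len=$2
    genome_size=$3
    echo $(( reads * read_len / genome_size ))
}

# 3 libraries x 15M reads, 151 bp reads, 25 Mb genome (numbers from the question).
estimate_coverage 45000000 151 25000000   # prints 271

# A possible normalization call, one library at a time. The output names and
# target depth are assumptions; check them against bbnorm.sh's own help text:
# bbnorm.sh in=work/mapeo1_F.fastq in2=work/mapeo1_R.fastq \
#           out=work/mapeo1_F.norm.fastq out2=work/mapeo1_R.norm.fastq target=100
```

Roughly 270x is far more depth than an assembler normally needs, which is why normalizing first is worth trying.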

"I'm working on a server with 1.5 TB of RAM"

Are you the only user on this machine? If not, your program may have much less RAM available to work with.

ADD REPLY • link 3.3 years ago by GenoMax 141k

Not related to RAM consumption, but you are running SPAdes incorrectly. If you have three libraries, you should provide them with --pe1-1, --pe1-2, --pe2-1, --pe2-2, --pe3-1, --pe3-2. The number after "pe" is the library number.
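As a sketch of what that looks like for the command in the question: the helper below only prints the SPAdes command (a dry run) so you can inspect it before running it. The function name is hypothetical, and which file pair gets which library number is an arbitrary choice here; the other options are copied from the original command.

```shell
# Print a SPAdes command where library number i gets --pe<i>-1 / --pe<i>-2.
# Usage: build_spades_cmd OUTDIR PREFIX...   (expects PREFIX_F.fastq / PREFIX_R.fastq)
build_spades_cmd() {
    outdir=$1
    shift
    cmd="spades.py --careful -k 55,77,99 -t 32 -m 1000"
    i=1
    for prefix in "$@"; do
        cmd="$cmd --pe${i}-1 ${prefix}_F.fastq --pe${i}-2 ${prefix}_R.fastq"
        i=$((i + 1))
    done
    printf '%s -o %s\n' "$cmd" "$outdir"
}

# Dry run: prints the command instead of executing it.
build_spades_cmd work/Hcol work/mapeo1 work/mapeo2 work/mapeo3
```

Once the printed command looks right, it can be pasted into the terminal as is.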

ADD REPLY • link 3.3 years ago by shelkmike ★ 1.2k

You can also check how much memory is actually free with free -h. Your actual free memory will be somewhere between the "free" column and the "available" column. In theory, all the memory in the "available" column should be accessible to you, but in practice we've found that this sometimes isn't the case (e.g. we once had a case where a memory-mapped file was being kept in memory after the program that used it had terminated, and wasn't being released when the OS asked for it).
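If you want those two numbers in a script (for example, to log them right before starting the assembly), a minimal sketch is to read them from /proc/meminfo, which is where free gets them on Linux: MemFree corresponds to the "free" column and MemAvailable to "available". The function name is made up for illustration.

```shell
# Print MemFree and MemAvailable in whole GiB from a /proc/meminfo-style file.
# The kernel reports both values in kB, so dividing by 1048576 gives GiB.
mem_gib() {
    awk '/^MemFree:/      {free = $2}
         /^MemAvailable:/ {avail = $2}
         END {printf "free=%d available=%d\n", free / 1048576, avail / 1048576}' "$1"
}

# Linux only: skip quietly on systems without /proc/meminfo.
if [ -r /proc/meminfo ]; then
    mem_gib /proc/meminfo
fi
```

The memory you can really count on lies somewhere between the two printed values, as described above.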

ADD REPLY • link 3.3 years ago by i.sudbery 19k

score 0 · Answer 1 · 2020-12-22

SPAdes monitors RAM usage during its run. Can you look in its logs and find the RAM usage just before SPAdes crashed?
Another way to find the peak RAM consumption of a program is to prefix its command with "/usr/bin/time -v". In your case the command would be:

/usr/bin/time -v spades.py --careful -k 55,77,99 -t 32 -m 1000 --pe1-1 work/mapeo2_F.fastq --pe1-2 work/mapeo2_R.fastq --pe1-1 work/mapeo1_F.fastq --pe1-2 work/mapeo1_R.fastq --pe1-1 work/mapeo3_F.fastq --pe1-2 work/mapeo3_R.fastq -o work/Hcol

Then look at the "Maximum resident set size (kbytes):" line in the output after SPAdes crashes.
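The value on that line is in kilobytes, which is awkward to read at this scale. A small sketch for pulling it out of a saved report and converting it to GiB; it assumes you saved the report to a file with GNU time's -o option, and the function and file names are hypothetical.

```shell
# Extract the peak resident set size from a saved "/usr/bin/time -v" report
# and print it in GiB (the report gives it in kB).
peak_rss_gib() {
    awk -F': ' '/Maximum resident set size \(kbytes\)/ {printf "%.1f\n", $2 / 1048576}' "$1"
}

# Example usage (not run here):
#   /usr/bin/time -v -o time.log spades.py ...your options...
#   peak_rss_gib time.log
```

If the printed peak is well below the -m limit and below what the machine has available, the crash is probably not a plain shortage of RAM on the server.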

In my experience, 1.5 TB of RAM should definitely be enough to assemble a 25-megabase genome.