Memory allocation in metaSPAdes
3.5 years ago
Maorkn • 0

Hello all, I'm trying to assemble a metagenome using metaSPAdes. My assembly keeps failing due to insufficient memory. I'm trying to use 30 cores with 500 GB RAM each, and was wondering: should I set the -m flag to 500, or to 30*500, i.e. 15,000? This is an example of the command used:

spades.py -k 21,33,55,77 -m 500 --threads 30 --pe1-1 TS_R1_val_1.fq.gz --pe1-2 TS_R2_val_2.fq.gz --meta -o spades_output

or should it be:

spades.py -k 21,33,55,77 -m 15000 --threads 30 --pe1-1 TS_R1_val_1.fq.gz --pe1-2 TS_R2_val_2.fq.gz --meta -o spades_output

And a follow-up question: when the SPAdes manual says it may require 700-800 GB RAM, does it mean in total, or per thread?

spades metagenomics RAM Assembly • 2.2k views

Memory requirements are given as a total unless a unit is specified (e.g. 1 GB per million reads). The memory available in practice is determined by the hardware you have access to; in this case, 500 GB is shared by the 30 cores.
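Not from the thread, but as a quick sanity check before picking -m: read the node's actual total RAM instead of guessing. A minimal sketch, assuming a Linux host (a job scheduler such as SLURM may cap you below this):

```shell
# Read total RAM from /proc/meminfo and leave a little headroom for the OS.
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
total_gb=$((total_kb / 1024 / 1024))
spades_mem=$((total_gb * 95 / 100))   # keep ~5% headroom rather than handing SPAdes everything
echo "node has ${total_gb} GB; try: spades.py -m ${spades_mem} ..."
```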

What is the size of the input data files (in number of reads X cycles of sequencing)?


Thank you! I adjusted the command per your answer and it is now working.

The input is 80 million reads from soil samples.

3.5 years ago
Mensur Dlakic ★ 27k

It is a total memory requirement. I'll go out on a limb and guess that there are fewer than 50 computers in the world with 30*700 GB of RAM, and I strongly suspect that most of us don't have access to them.

If memory prevents you from assembling the dataset, you may want to give MEGAHIT a try.
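A hedged sketch of what that could look like on the same reads (flag values here are illustrative, not a tested recipe). Note that MEGAHIT's -m flag takes a fraction of total RAM when the value is <= 1, unlike the gigabyte count SPAdes expects:

```shell
# Illustrative only: same paired-end reads and k-mer list as the SPAdes run above.
# -m 0.9 means "use up to 90% of the machine's memory".
megahit -1 TS_R1_val_1.fq.gz -2 TS_R2_val_2.fq.gz \
        --k-list 21,33,55,77 \
        -t 30 -m 0.9 \
        -o megahit_output
```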
