Hello all, I'm trying to assemble a metagenome using metaSPAdes, but my assembly keeps failing due to insufficient memory. I'm trying to use 30 cores with 500 GB RAM each, and was wondering: should I set the -m flag to 500, or to 30 × 500, i.e. 15,000? This is an example of the command used:
spades.py -k 21,33,55,77 -m 500 --threads 30 --pe1-1 TS_R1_val_1.fq.gz --pe1-2 TS_R2_val_2.fq.gz --meta -o spades_output
Or should it be:
spades.py -k 21,33,55,77 -m 15000 --threads 30 --pe1-1 TS_R1_val_1.fq.gz --pe1-2 TS_R2_val_2.fq.gz --meta -o spades_output
And a follow-up question: when the SPAdes manual says it may require 700-800 GB RAM, does it mean in total, or per thread?
Memory requirements are totals unless specified with some other unit (e.g. 1 GB per million reads, etc.). The memory available in practice is determined by the hardware you have access to, i.e. in this case 500 GB is shared by the 30 cores. SPAdes' -m flag is a total RAM limit in GB for the whole run, so -m 500 is the correct setting here.
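If you're submitting through a job scheduler (a SLURM sketch below, as an assumption; adapt to your own system), the same logic applies: the scheduler's *total* memory request should match -m, not be multiplied by the thread count.

```shell
#!/bin/bash
#SBATCH --job-name=metaspades
#SBATCH --cpus-per-task=30   # threads for SPAdes
#SBATCH --mem=500G           # total memory for the job, shared by all threads

# -m is the total RAM limit in GB for the whole run, not per thread,
# so it should match the scheduler's total memory request above.
spades.py --meta \
    -k 21,33,55,77 \
    -m 500 \
    --threads 30 \
    --pe1-1 TS_R1_val_1.fq.gz \
    --pe1-2 TS_R2_val_2.fq.gz \
    -o spades_output
```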
What is the size of the input data files (number of reads × cycles of sequencing)?
Thank you! I adjusted the command per your answer and it is now working.
80 million reads, from soil samples.