Question: Tips for increasing metaSPAdes assembly speed for 7 billion reads?
anin.gregory90 wrote:

I have 7 billion paired-end reads from multiple microbiome studies that I want to cross-assemble using metaSPAdes.

Background:

  • I need to use metaSPAdes
  • I have access to a 1.5TB memory node, where jobs can run almost indefinitely, but the assembly has to be done by an October deadline
  • All reads have been error-corrected using bbnorm.sh
  • I started a cross-assembly on the 1.5TB node using the '--only-assembler' flag; it has been running for 3 weeks
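
For reference, an assembly-only metaSPAdes run of the kind described in this list might be launched roughly as follows. This is only a sketch: the file names, thread count, and memory value are placeholders, not the settings actually used for this job. The `-m` option is the SPAdes memory limit in GB, and its default is far below 1.5TB, so it has to be raised explicitly to make use of a large node.

```shell
# Sketch of an assembly-only metaSPAdes run; all paths and numbers are placeholders.
R1=pooled_R1.fq.gz      # hypothetical pooled forward reads (all studies concatenated)
R2=pooled_R2.fq.gz      # hypothetical pooled reverse reads
OUT=metaspades_out      # hypothetical output directory
THREADS=64              # assumed core count
MEM_GB=1400             # memory limit in GB, left slightly below the 1.5TB node

# --meta selects metaSPAdes; --only-assembler skips SPAdes' own read error correction.
CMD="spades.py --meta --only-assembler -1 $R1 -2 $R2 -o $OUT -t $THREADS -m $MEM_GB"

# Print the command by default; set DRY_RUN=0 to actually launch the assembly.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CMD"
else
    $CMD
fi
```

If a run like this is interrupted, `spades.py --continue -o metaspades_out` resumes from the last checkpoint instead of starting over.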

For the last 1.5 weeks the assembly has been stuck in the 'post-simplification' step, at 'Running Disconnecting edges with relatively low coverage'. I searched the SPAdes website and various forums to see whether this step is slow for others, but could not find any discussion of it. Has anyone else had this problem? Does anyone have tips for speeding up the assembly?

Thanks!

written 2.8 years ago by anin.gregory90

My recommendation would be to use MEGAHIT in this case; it is much less resource-intensive than SPAdes.
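
A MEGAHIT run on the same data could look roughly like the sketch below. The file names, thread count, and output directory are placeholders chosen for illustration; `--presets meta-large` is MEGAHIT's preset for large, complex metagenomes, and whether it suits this dataset is an assumption.

```shell
# Sketch of a MEGAHIT run; all paths and numbers are placeholders.
R1=pooled_R1.fq.gz       # hypothetical forward reads
R2=pooled_R2.fq.gz       # hypothetical reverse reads
OUT=megahit_out          # MEGAHIT expects this directory not to exist yet
THREADS=64               # assumed core count
PRESET=meta-large        # preset aimed at large, complex metagenomes

CMD="megahit -1 $R1 -2 $R2 -o $OUT -t $THREADS --presets $PRESET"

# Print the command by default; set DRY_RUN=0 to actually launch the assembly.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CMD"
else
    $CMD
fi
```

The `-1` and `-2` options each accept a comma-separated list of files, so reads from several studies can be passed without concatenating them first.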

If you download the latest version of BBMap, there is now a file at:

bbmap/pipelines/assemblyPipeline.sh

That file shows my suggested method of preprocessing data prior to assembly. It includes various trimming, filtering, and error-correction operations to minimize the number of erroneous k-mers that increase the time and memory needed to assemble large metagenomes, so it may be helpful in this case.
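
As a rough illustration of that kind of preprocessing, the sketch below chains two BBTools steps: adapter trimming with BBDuk and k-mer-based error correction with Tadpole. It is not a copy of assemblyPipeline.sh, which should be consulted for the full recommended sequence; the file names are placeholders and the parameter values are common choices assumed for the example.

```shell
# Sketch of two preprocessing steps with BBTools; not a copy of assemblyPipeline.sh.
# All file names are placeholders.
ADAPTERS=adapters.fa     # adapter sequences (a copy ships in bbmap/resources/)

# Step 1: trim adapters from the 3' end with BBDuk.
# Paired input with a single out= file is written interleaved.
TRIM_CMD="bbduk.sh in=raw_R1.fq.gz in2=raw_R2.fq.gz out=trimmed.fq.gz ref=$ADAPTERS ktrim=r k=23 mink=11 hdist=1 tbo tpe"

# Step 2: k-mer-based error correction of the trimmed reads with Tadpole.
ECC_CMD="tadpole.sh in=trimmed.fq.gz out=corrected.fq.gz mode=correct"

# Print the commands by default; set DRY_RUN=0 to actually run them in order.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$TRIM_CMD"
    echo "$ECC_CMD"
else
    $TRIM_CMD && $ECC_CMD
fi
```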

written 2.8 years ago by Brian Bushnell17k

"All reads have been error-corrected using bbnorm.sh"

Did you also normalize the reads?

written 2.8 years ago by st.ph.n2.5k

No, some benchmarking we did in our lab has shown that normalization reduces our contig lengths, because SPAdes uses differential coverage to resolve ambiguities.
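
To make the distinction concrete, the sketch below contrasts the two ways bbnorm.sh can be used: error correction alone, which keeps every read and so preserves relative coverage, and normalization, which downsamples high-depth reads. The file names are placeholders, and the parameter values follow commonly cited BBNorm examples rather than the settings actually used here.

```shell
# Sketch contrasting error correction and normalization with bbnorm.sh.
# File names are placeholders; parameter values are illustrative assumptions.
IN=reads.fq.gz

# Error correction only: keepall retains every read, so relative coverage is preserved.
ECC_CMD="bbnorm.sh in=$IN out=corrected.fq.gz ecc=t keepall passes=1 bits=16 prefilter"

# Normalization: downsample to about 100x depth and discard reads with depth below 5x.
NORM_CMD="bbnorm.sh in=$IN out=normalized.fq.gz target=100 min=5"

# Only print the two variants; run whichever one is wanted by hand.
echo "$ECC_CMD"
echo "$NORM_CMD"
```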

written 2.8 years ago by anin.gregory90