I have 7 billion paired-end reads from multiple microbiome studies that I want to cross-assemble using metaSPAdes.
- I need to use metaSPAdes
- I have access to a node with 1.5TB of memory, where the job can run almost indefinitely, but the assembly has to be done by October
- All reads have been error-corrected using bbnorm.sh
- I started a cross-assembly on the 1.5TB node with the '--only-assembler' flag, and it has now been running for 3 weeks
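For reference, a run of this kind would be launched roughly as in the sketch below. The file names, output directory and thread count are placeholders, not the exact values used; only '--meta', '--only-assembler' and the 1.5TB memory cap ('-m' is given in GB) reflect the setup described above.

```shell
#!/usr/bin/env bash
# Hypothetical names: pooled_R1.fastq.gz / pooled_R2.fastq.gz stand in for the
# pooled, already-corrected reads; cross_assembly_out and -t 64 are placeholders.
cmd=(
  spades.py --meta --only-assembler
  -1 pooled_R1.fastq.gz
  -2 pooled_R2.fastq.gz
  -t 64
  -m 1500
  -o cross_assembly_out
)

# Print the command; set DRY_RUN=0 to actually launch it.
echo "${cmd[*]}"
if [ "${DRY_RUN:-1}" = "0" ]; then
  "${cmd[@]}"
fi
```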
For the last 1.5 weeks the assembly has been stuck in the post-simplification stage, on the step 'Running Disconnecting edges with relatively low coverage'. I have searched the SPAdes website and various forums to see whether this step is slow for others, but could not find any discussion of it. Has anyone else run into this problem? Does anyone have any tips to speed up the assembly?