Question: Very slow SOAPdenovo2 assembly
4.3 years ago, jamesT 30 (United States) wrote:

I am assembling a bacterial genome of roughly 7 Mbp from approximately 20 million 101 bp paired-end reads, which should give me excellent coverage. Velvet completes this assembly and gives an okay N50 in approximately 5 minutes, but SOAPdenovo2 has been running on the file for 6+ hours without even getting past the pregraph step. The same thing happened with both the 63mer and the 127mer programs. The output says it's on something like the 10 billionth read, which doesn't make any sense. The server has plenty of RAM (120 GB) and 8 cores, and SOAPdenovo2 is barely using any of that RAM, so memory is clearly not the issue. The command I'm currently running is:

all -s /data/config -K 63 -R -F -o graph_prefix 1>ass.log 2>ass.err

and the config file is:

#maximal read length
#average insert size
#if sequence needs to be reversed
#in which part(s) the reads are used
#use only first 100 bps of each read
#in which order the reads are used while scaffolding
# cutoff of pair number for a reliable connection (at least 3 for short insert size)
#minimum aligned length to contigs for a reliable read location (at least 32 for short insert size)
#fastq file for single reads


Here, assembly.fastq is an interleaved paired-end reads file. Does anyone know what I might be doing wrong to cause such a long assembly time?
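As a sanity check on the numbers in the question, here is a minimal Python sketch that estimates the expected coverage and counts the records actually present in a FASTQ file. It assumes that "20 million" counts individual reads rather than pairs, and that every FASTQ record occupies exactly four lines (no wrapped sequences). The function names are hypothetical, chosen for illustration only.

```python
def expected_coverage(num_reads: int, read_len_bp: int, genome_size_bp: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    return num_reads * read_len_bp / genome_size_bp


def fastq_record_count(path: str) -> int:
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    with open(path, "rb") as handle:
        line_count = sum(1 for _ in handle)
    return line_count // 4


if __name__ == "__main__":
    # Figures taken from the question: 20 million reads, 101 bp, ~7 Mbp genome.
    cov = expected_coverage(20_000_000, 101, 7_000_000)
    print(f"expected coverage: {cov:.0f}x")  # prints "expected coverage: 289x"
```

If `fastq_record_count` reports about 20 million records while the assembler's log counts billions of reads, that points to the input being parsed in the wrong format rather than to a resource limit.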


I'm not sure this will solve your problem, but for SE reads you should use "q" instead of "p". In your case:
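As a separate, hedged sketch rather than the answerer's own suggestion: because the reads are paired, another option is to split the interleaved file into two FASTQ files. As far as I understand the SOAPdenovo2 config format, `p=` is meant for a single FASTA file of paired reads, while `q1=` and `q2=` take the two FASTQ files of a pair. The function name and output file names below are hypothetical, and the code assumes four-line records with the mates strictly alternating.

```python
from itertools import islice


def split_interleaved_fastq(interleaved_path: str, read1_path: str, read2_path: str) -> int:
    """Split an interleaved FASTQ into two mate files; return the number of pairs."""
    pairs = 0
    with open(interleaved_path) as src, \
            open(read1_path, "w") as out1, \
            open(read2_path, "w") as out2:
        while True:
            rec1 = list(islice(src, 4))  # first mate: header, sequence, '+', quality
            rec2 = list(islice(src, 4))  # second mate follows immediately
            if not rec1:
                break  # clean end of file
            if len(rec1) != 4 or len(rec2) != 4:
                raise ValueError("incomplete record or unpaired read at end of file")
            out1.writelines(rec1)
            out2.writelines(rec2)
            pairs += 1
    return pairs


if __name__ == "__main__":
    # Hypothetical output names; adjust to your own paths.
    n = split_interleaved_fastq("assembly.fastq", "assembly_1.fastq", "assembly_2.fastq")
    print(f"wrote {n} read pairs")
```

The two output files could then be listed under `q1=` and `q2=` in the library section of the config in place of the `p=` entry.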


modified 4.2 years ago • written 4.2 years ago by iraun 3.5k