I'm trying to assemble partial genomes from a metagenome with approximately 15 million paired-end reads, which look like:
>SN7001163:100:D237CACXX:2:1108:5829:37887/1 N:0:AGGCAGA
CCCTCATCCGGCACGCCATCGAGAAGGGGGCGCACTGGATGACGCTCGAGGCGCGCGTCTCCAACACCGTGGCGCACGCGCTGTACCGCAAGTACACCTTC
>SN7001163:100:D237CACXX:2:1108:13068:98533/2 N:0:AGGCAGA
GTCCCGCCCAGCCGGGCCGCCGAGCGCGATCCCCTGGACGTCGTCGCGACGATGAGCGTCTTCGCGGAGTTCGCCAGGGCGGTGGGCGGCGACCTCGTCTC
I was following the tutorial on the MetaVelvet website, and compiled Velvet and MetaVelvet using:
make 'MAXKMERLENGTH=101' 'OPENMP=1' 'LONGSEQUENCES=1'
I then ran velveth with a k-mer length of 61:
velveth velvet/ 61 -fasta -shortPaired myreads.fasta
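Since velveth's -shortPaired mode expects each read to be immediately followed by its mate in the same file, one thing I can check is whether myreads.fasta is really interleaved that way. Below is a minimal sketch of such a check, not something taken from the tutorial. It assumes every header ends in /1 or /2 before the first space, as in my sample above; mate_key and find_unpaired are hypothetical helper names, not part of Velvet or MetaVelvet.

```python
import re


def mate_key(header):
    """Split a FASTA header like '>ID/1 N:0:AGGCAGA' into (ID, mate number)."""
    name = header[1:].split()[0]  # drop '>' and anything after whitespace
    match = re.fullmatch(r"(.+)/([12])", name)
    if match is None:
        raise ValueError(f"header has no /1 or /2 suffix: {header!r}")
    return match.group(1), match.group(2)


def find_unpaired(lines):
    """Return the headers of records that break the /1, /2 interleaving.

    `lines` is any iterable of FASTA lines (for example an open file handle),
    so the file is streamed rather than loaded into memory.
    """
    problems = []
    pending = None  # (read_id, header) of a /1 record waiting for its /2 mate
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        header = line.strip()
        read_id, mate = mate_key(header)
        if pending is not None and mate == "2" and read_id == pending[0]:
            pending = None  # pair is complete
            continue
        if pending is not None:
            problems.append(pending[1])  # the earlier /1 never got its mate
            pending = None
        if mate == "1":
            pending = (read_id, header)
        else:
            problems.append(header)  # a /2 with no matching /1 before it
    if pending is not None:
        problems.append(pending[1])  # file ended on an unmatched /1
    return problems
```

Passing an open handle on myreads.fasta to find_unpaired and printing the first few entries of the result would show whether any reads are orphaned or out of order. An empty list would mean every /1 record is directly followed by its /2 mate.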
This seemed to run correctly and created the necessary files. I then ran velvetg:
velvetg velvet/ -exp_cov auto -ins_length 260
(I'm not sure of my exact insert length; this was an Illumina HiSeq run.) This also seemed to work, and produced contigs.fa and Graph2. The final output of this run was:
[188.416726] Estimated Coverage = 3.862745
[188.416758] Estimated Coverage cutoff = 1.931373
Final graph has 78220 nodes and n50 of 360, max 32930, total 22293747, using 3764027/13750351 reads
However, I then ran meta-velvetg ./velvet/ | tee logfile, and the results were:
[meta-velvetg] Check command line options... OK. Your command line options seem to be good.
[meta-velvetg] Load meta-graph ...
[0.000001] Reading read set file ./velvet//Sequences;
[2.006028] 13750351 sequences found
[14.165548] Done
[29.950580] Reading graph file ./velvet//Graph2
[29.950634] Graph has 102502 nodes and 13750351 sequences
[meta-velvetg] Category = 'short1' Ave. = -1, SD = -1
[meta-velvetg] Category = 'short2' Ave. = -1, SD = -1
[meta-velvetg] Category = 'long' Ave. = -1, SD = -1
[meta-velveth] ...done (load meta-graph).
[meta-velvetg] Estimate coverage parameters...
[mate-velvetg] Estimate expected coverage ... yes. Expected coverage = 3.86275
[mate-velvetg] Estimate expected coverages ... yes.
[32.550552] Writing into stats file ./velvet//meta-velvetg.Graph2-stats.txt...
[MetaHisto] First valley = 3
[MetaHisto] Largest peak coverage = 3 (frequency count = 3.2506e+06)
[MetaHisto] Noise cutoff coverage = 325060
[MetaHisto] Find 1-th coverage peak: 10 (frequency count = 2.2049e+06)
[meta-velvetg] Warning: Can't find multiple coverage peaks.
[meta-velvetg] Trun on single coverage peak mode.
[MetaGraph] 1-th coverage peak = 3.86275
[meta-velvetg] Estimate coverage cutoff ... yes. Coverage cutoff = 1.93137
[meta-velvetg] ...done (estimate coverage parameters).
[meta-velvetg] Remove low & high coverage nodes ...
[meta-velvetg] Min. coverage cutoff for short reads = 1.93137
[meta-velvetg] Min. coverage cutoff for long reads = -1
[meta-velvetg] Max. coverage cutoff for short & long reads = -1
[meta-velvetg] Min. contig length = -1
[VelvetGraph] === Remove low coverage nodes ===
[32.788521] Removing contigs with coverage < 1.931373...
[32.804994] Concatenation...
[32.837935] Renumbering nodes
[32.837940] Initial node count 102502
[32.840015] Removed 23989 null nodes
[32.840018] Concatenation over!
[32.841552] Concatenation...
[32.843933] Renumbering nodes
[32.843936] Initial node count 78513
[32.844192] Removed 0 null nodes
[32.844194] Concatenation over!
[VelvetGraph] === Remove high coverage nodes ===
[VelvetGraph] === Clip tips hardly ===
[32.844225] Clipping short tips off graph, drastic
[32.857123] Concatenation...
[32.859803] Renumbering nodes
[32.859805] Initial node count 78513
[32.861166] Removed 126 null nodes
[32.861169] Concatenation over!
[32.861170] 78387 nodes left
[meta-velvetg] ...done (remove low & high coverage nodes).
[meta-velvetg] Scaffolding based on paired-end information ...
[MetaGraph] === Scaffolding with single peak mode ===
[VelvetGraph] === Rock Bank ===
[32.861196] Read coherency...
[32.862602] Identifying unique nodes
[32.864256] Done, 61600 unique nodes counted
[32.864258] Trimming read tips
[32.866858] Renumbering nodes
[32.866860] Initial node count 78387
[32.867108] Removed 0 null nodes
[32.867110] Confronted to 0 multiple hits and 0 null over 0
[32.867111] Read coherency over!
[VelvetGraph] === Create read paring array ===
[VelvetGraph] === Detach dubious reads ===
[VelvetGraph] === Activate gap markers ===
[VelvetGraph] === Scaffolding ===
MetaVelvet created the two stats files, but not meta-velvetg.contigs.fa! The log stops at the Scaffolding stage, so I think it may have crashed there. I'm running this analysis on an Ubuntu machine with 60 GB of RAM. Can anyone help me work out what is going wrong?