**40** wrote:

I am trying to use VelvetOptimiser to do a de novo assembly of a plant genome. I have two libraries: a 100 bp paired-end (PE) library with an insert size of ~400 bp, and a 150 bp PE mate-pair library with an insert size of 12-18 kb. I have several questions:

A) I first tried an initial assembly with only one library (the 100 bp PE library, insert size 440 bp) like this:

VelvetOptimiser.pl -s 35 -e 47 -f '-fastq -shortPaired pe_merged -fastq -short s_se' -a -o '-unused_reads yes' -t 1

The results were good (Final graph has 731536 nodes and n50 of 5769, max 55415, total 149725954, using 51729078/69351035 reads). The VelvetOptimiser identified K=41 and set the Expected coverage to 10.
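For context, here is a quick back-of-the-envelope check of how many reads ended up in the assembly. It is only a sketch: the two numbers are copied from the velvetg summary line above, and the variable names are mine.

```python
# Figures copied from the "using 51729078/69351035 reads" part of the
# velvetg summary line quoted above.
reads_used = 51_729_078
reads_total = 69_351_035

# Fraction of input reads that velvetg placed in the final graph.
fraction_used = reads_used / reads_total
print(f"reads used: {fraction_used:.1%}")
print(f"reads unused: {reads_total - reads_used}")
```

So roughly three quarters of the reads were used, and the rest go to the unused-reads output requested with '-unused_reads yes'.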

**First question:** VelvetOptimiser took 6 iterations to declare the optimum coverage cutoff to be 1.89, yet it ran the final velvetg with -cov_cutoff 1.44. Can anyone explain this? Should I re-run velvetg with the optimum coverage cutoff?

**Second question:** the Paired Library insert stats gave me 2 lines:

Paired-end library 1 has length: 438, sample standard deviation: 19

Paired-end library 1 has length: 441, sample standard deviation: 36

Why is this coming out as two different libraries?

**Third question:** The final output contains runs of N's. According to Quast, another tool for assessing assemblies, # N's per 100 kbp = 10271. I did not use the "-scaffolding" option, and I do not see VelvetOptimiser passing it to velvetg. Can anyone explain why Velvet is generating these N's?
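To put the Quast figure in perspective, this small sketch converts it into a percentage and a rough base count. It assumes the total length that Quast evaluated is close to the 149725954 bases reported by velvetg, which may not hold exactly if Quast filtered out short contigs; the variable names are mine.

```python
# Quast metric quoted above: number of N's per 100 kbp of assembly.
ns_per_100kbp = 10_271

# ASSUMPTION: Quast's total assembly length is close to the total
# reported in the velvetg summary line (149725954 bases).
total_bases = 149_725_954

# N's per 100,000 bases converts directly to a fraction of the assembly.
n_fraction = ns_per_100kbp / 100_000
approx_n_bases = n_fraction * total_bases

print(f"fraction of N's: {n_fraction:.2%}")
print(f"approximate N bases: {approx_n_bases / 1e6:.1f} Mbp")
```

Under that assumption, about a tenth of the assembly is N's, which is why I am surprised to see them without scaffolding being requested explicitly.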

B) Then I tried to combine the 2 libraries like this:

VelvetOptimiser.pl -s 31 -e 47 -f '-fastq -shortPaired pe_merged -fastq -shortPaired2 mp_merged -fastq -short s_se_combined' -a -o '-unused_reads yes' -t 1

The results came out VERY bad. The expected coverage even went down to 9. Here are the results of the final iteration:

Velvet hash value: 47

Roadmap file size: 7263105695

Total number of contigs: 387128

n50: 811

length of longest contig: 17283

Total bases in contigs: 158410389

Number of contigs > 1k: 36064

Total bases in contigs > 1k: 68344726

Paired Library insert stats:

Paired-end library 1 has length: 437, sample standard deviation: 21

Paired-end library 2 has length: 211, sample standard deviation: 105

Paired-end library 1 has length: 440, sample standard deviation: 57

Paired-end library 2 has length: 220, sample standard deviation: 138

Paired-end library 1 has length: 440, sample standard deviation: 62

Paired-end library 2 has length: 221, sample standard deviation: 149

So my **Fourth question** is: how should I do the genome assembly **using two libraries with different insert sizes**? Is it better to finish the assembly from the first stage and then use another program (SSPACE) to integrate the mate-pair sequences?

Thank you