How to set -max_coverage in velvetg
11.1 years ago
rwn ▴ 590


I am assembling bacterial genomes (~6 Mb) using 250 bp paired-end MiSeq data. I have tried a bunch of assemblers (idba_ud, mira, ray, SOAPdenovo and ABySS, to name a few), but I am getting reasonably good results with good old velvet (~360 contigs, N50 = 40 kb). My question is how to set the velvetg parameter -max_coverage. Its value has a large effect on the resulting number of contigs and on the total number of bases in the assembly (i.e. the assembled genome size). Am I correct in thinking that many of these high-coverage nodes are errors (or at least error-prone, like repeat elements etc.) and should be excluded for a better assembly?

I estimate the coverage distribution (in R using plotrix) from the stats.txt file after a preliminary run: velvetg velvet_big_127 -cov_cutoff auto -exp_cov auto. From that it is easy to calculate the weighted mean coverage for -exp_cov and to set a reasonable value for -cov_cutoff, but the distribution often has a long tail, meaning that there is a small number of nodes with very high coverage.
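For anyone who wants to reproduce the weighted estimate outside R, here is a minimal Python sketch. It assumes stats.txt is tab-delimited with a header row containing the columns `lgth` and `short1_cov` (as in the Velvet manual's R example); the function names are my own and purely illustrative. Velvet reports both values in k-mer units, so the numbers are k-mer coverage, not base coverage.

```python
import csv
import io
import math


def read_nodes(stats_text, cov_column="short1_cov"):
    """Return (length, coverage) pairs parsed from the text of a stats.txt file.

    Assumes a tab-delimited table with a header row that includes 'lgth'
    and the chosen coverage column.
    """
    nodes = []
    for row in csv.DictReader(io.StringIO(stats_text), delimiter="\t"):
        try:
            length = float(row["lgth"])
            cov = float(row[cov_column])
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if length > 0 and math.isfinite(cov):
            nodes.append((length, cov))
    return nodes


def weighted_mean(nodes):
    """Length-weighted mean coverage over all nodes."""
    total = sum(length for length, _ in nodes)
    return sum(length * cov for length, cov in nodes) / total


def weighted_quantile(nodes, q):
    """Coverage below which a fraction q of the total node length lies."""
    total = sum(length for length, _ in nodes)
    running = 0.0
    for length, cov in sorted(nodes, key=lambda node: node[1]):
        running += length
        if running >= q * total:
            return cov
    raise ValueError("no nodes to summarise")
```

Typical use would be `nodes = read_nodes(open("velvet_big_127/stats.txt").read())`, followed by `weighted_mean(nodes)` and, to see where the long tail starts, something like `weighted_quantile(nodes, 0.99)`.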

Generally, what is a good way to determine a sensible value for -max_coverage?

Many thanks! Reuben


velvet genomics • 3.0k views
10.7 years ago
Torst ▴ 980

You usually shouldn't set -max_coverage at all, unless you know your sample contains "contamination" at a higher coverage than the genome you are actually trying to recover. In that case you would use it as a "low pass filter". Another scenario is a plasmid with a high copy number relative to your chromosome: you could use -max_coverage to filter out the plasmid, and then use -cov_cutoff etc. to do the opposite and recover the plasmid on its own. But in general, setting -max_coverage will only remove repeat elements from your assembly. Although this may improve your metrics like N and N50, the improvement is artificial: what is left over is the same as what was there before, just without the repeated contigs.
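To check how much a candidate -max_coverage would actually drop before committing to it, a rough Python sketch like the one below may help. It takes (length, coverage) pairs per node, for example from the `lgth` and `short1_cov` columns of stats.txt. The function name is hypothetical, and treating "coverage strictly above the threshold" as removed is my assumption about how velvetg applies the cutoff, so take the result as an estimate only.

```python
def removed_by_max_coverage(nodes, max_coverage):
    """Summarise the nodes whose coverage exceeds a candidate threshold.

    nodes: iterable of (length, coverage) pairs.
    Returns (number of nodes, summed length, fraction of total length).
    Assumption: a node counts as removed when coverage > max_coverage.
    """
    nodes = list(nodes)
    total = sum(length for length, _ in nodes)
    hit = [(length, cov) for length, cov in nodes if cov > max_coverage]
    removed = sum(length for length, _ in hit)
    fraction = removed / total if total else 0.0
    return len(hit), removed, fraction
```

If the nodes above the threshold are few but long, or sit at a clean multiple of the expected coverage, they are more likely to be repeats or a multi-copy plasmid than errors, which is the case where removing them only makes the metrics look better.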

