Question

Very deep coverage data

1

9.0 years ago

giltae.song ▴ 10

I have paired-end 100 bp reads at roughly 50,000x coverage. The expected genome size is about 12.5 Mb. I would like to run ABySS on this data and see how much the assembly improves compared with data at 200x coverage. Do you have any suggestions for running ABySS on data this deep? Is it feasible to use the regular ABySS paired-end mode to obtain the assembly?

abyss assembly • 1.9k views

ADD COMMENT • link updated 14 months ago by Ram 43k • written 9.0 years ago by giltae.song ▴ 10

Ram · Answer 1 · 2015-05-04

1

9.0 years ago

Brian Bushnell 20k

I do not think you will get a better assembly; a worse one is more likely. Unless your coverage is very uneven, going much beyond 100x or so typically starts to degrade the assembly, because a growing number of sequencing errors are replicated exactly, and these create false branches in the de Bruijn graph. With coverage in the thousands, people typically normalize or subsample to get a better assembly, though some metagenome, single-cell, or RNA-seq assemblers may be more tolerant of such high coverage.
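
To make the subsampling suggestion concrete, here is a minimal Python sketch, not a recommendation of any particular tool. It works out what fraction of read pairs to keep to go from 50,000x to about 100x, and roughly how many 100 bp pairs that is for a 12.5 Mb genome. The function names, the file-path parameters, and the assumption of uncompressed four-line FASTQ records with mates in the same order in both files are all mine, for illustration only.

```python
import random


def keep_fraction(current_cov, target_cov):
    """Fraction of read pairs to keep to go from current to target coverage."""
    return min(1.0, target_cov / current_cov)


def read_pairs_for_coverage(genome_size, coverage, read_len):
    """Approximate number of read pairs that gives `coverage` on the genome."""
    return round(genome_size * coverage / (2 * read_len))


def subsample_pairs(r1_path, r2_path, out1_path, out2_path, fraction, seed=0):
    """Keep each read pair with probability `fraction`, mates kept in sync.

    Assumes plain-text FASTQ with exactly four lines per record.
    Returns the number of pairs written.
    """
    rng = random.Random(seed)
    kept = 0
    with open(r1_path) as r1, open(r2_path) as r2, \
            open(out1_path, "w") as o1, open(out2_path, "w") as o2:
        while True:
            rec1 = [r1.readline() for _ in range(4)]
            rec2 = [r2.readline() for _ in range(4)]
            if not rec1[0] or not rec2[0]:
                break  # end of either file
            if rng.random() < fraction:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept


# Numbers from the question: 50,000x of 2 x 100 bp reads on a 12.5 Mb genome.
frac = keep_fraction(50_000, 100)                       # 0.002
pairs = read_pairs_for_coverage(12_500_000, 100, 100)   # 6,250,000 pairs
```

Because each pair is kept independently, the final coverage is only approximately the target. Note that this is plain random subsampling; normalization, which evens out coverage across the genome, is a different technique and is not shown here.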

ADD COMMENT • link updated 14 months ago by Ram 43k • written 9.0 years ago by Brian Bushnell 20k

0

In my opinion, this is good advice.

For what it's worth, ABySS automatically calculates its k-mer coverage threshold for filtering out erroneous k-mers from the k-mer coverage histogram, so in principle your data set should assemble fine. But you are probably not going to gain much from all that extra coverage.
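
As a rough illustration of what a histogram-based threshold can look like, the Python sketch below picks the first local minimum of a k-mer coverage histogram, the valley between the low-multiplicity error peak and the main coverage peak. This is a generic heuristic chosen for illustration and is not claimed to be the procedure ABySS actually uses; the function name and the toy histogram values are made up.

```python
def first_valley(hist):
    """Return the multiplicity at the first local minimum of the histogram.

    hist[i] is the number of distinct k-mers seen exactly i + 1 times.
    Returns None if no interior local minimum exists.
    """
    for i in range(1, len(hist) - 1):
        if hist[i] <= hist[i - 1] and hist[i] < hist[i + 1]:
            return i + 1  # convert list index to multiplicity
    return None


# Made-up histogram: an error peak at multiplicity 1, a valley, then the
# main peak around multiplicity 8.
toy_hist = [900, 300, 80, 20, 35, 120, 400, 650, 400, 120]
cutoff = first_valley(toy_hist)  # 4: k-mers seen fewer times are treated as errors
```

At very deep coverage the same sequencing error can recur many times, so erroneous k-mers reach higher multiplicities than they would at 100x, which is consistent with the point above that extra depth does not buy much.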

ADD REPLY • link updated 14 months ago by Ram 43k • written 9.0 years ago by benv ▴ 730