Very deep coverage data
9.0 years ago
giltae.song ▴ 10

I have paired-end 100 bp reads at 50,000x coverage. The expected genome size is about 12.5 Mb. I would like to assemble the data with ABySS and see how much the assembly improves compared with data at 200x coverage. Do you have any suggestions for running ABySS on this data? Is it feasible to use the regular ABySS paired-end mode for this assembly?

abyss assembly • 1.9k views
9.0 years ago

I do not think you will get a better assembly; more likely, a worse one. Unless your coverage is very uneven, going beyond about 100x typically starts to make the assembly worse, because a growing number of sequencing errors get replicated exactly, and those create false branches in the de Bruijn graph. With thousands-fold coverage, people typically normalize or subsample in order to get a better assembly. That said, some metagenome, single-cell, or RNA-seq assemblers may be more tolerant of such high coverage.
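To make the subsampling suggestion concrete, here is a minimal Python sketch using the numbers from the question (50,000x, 12.5 Mb genome, 100 bp paired-end reads) and an assumed target of 100x. The function names `subsample_fraction` and `subsample_pairs` are hypothetical, and the in-memory list of pairs is only for illustration; for a real data set of this size you would stream the FASTQ files or use a dedicated subsampling tool.

```python
import random


def subsample_fraction(current_coverage, target_coverage):
    """Fraction of read pairs to keep to go from current to target coverage."""
    if current_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverage values must be positive")
    return min(1.0, target_coverage / current_coverage)


def subsample_pairs(pairs, fraction, seed=0):
    """Keep each read pair with probability `fraction`.

    Each element of `pairs` holds both mates, so a pair is either kept
    whole or dropped whole, which preserves the paired-end information.
    """
    rng = random.Random(seed)
    return [pair for pair in pairs if rng.random() < fraction]


# Numbers from the question; the 100x target is an assumption.
genome_size = 12_500_000      # bp
read_length = 100             # bp per mate
target_coverage = 100

fraction = subsample_fraction(50_000, target_coverage)
pairs_needed = target_coverage * genome_size // (2 * read_length)

print(fraction)       # 0.002
print(pairs_needed)   # 6250000
```

In other words, roughly 1 pair in 500 is enough to reach 100x here. Drawing a few independent subsamples with different seeds is one way to check that the assembly is not sensitive to which reads were kept.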


In my opinion, this is good advice.

For what it's worth, ABySS automatically calculates its k-mer coverage threshold for filtering out error k-mers from the k-mer coverage histogram, so in principle your data set should assemble fine. But you are probably not going to gain much from all that extra coverage.
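As an illustration of the general idea of picking a threshold from a k-mer coverage histogram, here is a small Python sketch. It is not ABySS's actual code: the "first local minimum" rule, the function names `kmer_histogram` and `coverage_threshold`, and the toy histogram are all assumptions made for the example, and reverse complements are ignored for simplicity.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Return hist where hist[c] = number of distinct k-mers seen c times."""
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    if not counts:
        return [0]
    hist = [0] * (max(counts.values()) + 1)
    for c in counts.values():
        hist[c] += 1
    return hist


def coverage_threshold(hist):
    """Return the first coverage c >= 1 at which the histogram starts rising.

    Low-coverage k-mers are mostly errors and their counts fall off quickly;
    the first rise marks the start of the peak of real genomic k-mers.
    Returns None if the histogram never rises.
    """
    for c in range(1, len(hist) - 1):
        if hist[c + 1] > hist[c]:
            return c
    return None


# Toy histogram: an error tail at low coverage, then a peak of real k-mers.
toy = [0, 900, 300, 80, 20, 35, 90, 200, 310, 260, 120]
print(coverage_threshold(toy))  # 4
```

With extremely deep data, the same sequencing error can occur often enough that error k-mers reach substantial coverage themselves, so a cutoff chosen this way removes the bulk of errors but not necessarily all of them.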
