A genome assembler which uses information on DNA methylation?
18 months ago
shelkmike ★ 1.2k

As far as I know, all assemblers that use Nanopore reads work in a four-letter alphabet, i.e. A, T, G, C. However, Nanopore sequencing also provides information on methylation. It should therefore be possible to perform assembly in a five-letter space: A, T, G, C, mC. This may help to resolve repeats if one copy of a repeat has some cytosines methylated while the other does not.
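To make the idea concrete, here is a minimal Python sketch, not an existing assembler feature. It assumes methylated cytosines are written as a fifth symbol `M`, and the helper names `to_five_letter` and `kmers` are made up for this illustration. Two repeat copies with identical bases share every k-mer in the four-letter space but can be told apart once methylation is encoded.

```python
def to_five_letter(seq, methylated_positions):
    """Return seq with methylated cytosines (0-based positions) written as 'M'."""
    out = list(seq.upper())
    for pos in methylated_positions:
        if out[pos] != "C":
            raise ValueError(f"position {pos} is {out[pos]!r}, not a cytosine")
        out[pos] = "M"
    return "".join(out)


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


# Two copies of the same repeat: identical bases, different methylation.
bases = "ACGTCGGA"
copy_a = to_five_letter(bases, [1, 4])  # both cytosines methylated
copy_b = to_five_letter(bases, [])      # no methylation

# In four-letter space the copies share all k-mers;
# in five-letter space, here, they share none.
shared_four = kmers(bases, 4) & kmers(bases, 4)
shared_five = kmers(copy_a, 4) & kmers(copy_b, 4)
print(copy_a, copy_b, len(shared_four), len(shared_five))
```

In practice methylation calls are probabilistic and vary between molecules, so a real implementation would need to tolerate `C`/`M` mismatches rather than treat them as hard differences.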

I understand that methylation differs somewhat between cells in a given tissue, but I would guess that it is still highly correlated across cells. Is there some fundamental obstacle which prevents using information on DNA modifications in a de novo assembly?

assembly • 674 views

Is there some fundamental obstacle which prevents using information on DNA modifications in a de novo assembly?

To simplify, both Nanopore and PacBio require the alignment of the raw sequencing data against the reference genome to identify methylated loci.


Not true: Guppy can output this information without a reference.

18 months ago

Interesting idea.

You'd need a new FASTQ-like format that can carry the methylation calls, or a parallel file denoting the mC positions. At least Guppy can deliver the mC calls directly, without needing a reference genome to anchor the calls to.
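As a rough sketch of the parallel-file option, the Python below reads a hypothetical tab-separated side file (read ID, 0-based position, call probability) and merges confident calls into the read sequences. The file layout, the `0.8` threshold, and the function names are assumptions made for this example, not Guppy's actual output format.

```python
import csv
import io


def load_methylation_calls(handle, min_prob=0.8):
    """Map read ID -> set of 0-based positions called methylated with prob >= min_prob."""
    calls = {}
    for read_id, pos, prob in csv.reader(handle, delimiter="\t"):
        if float(prob) >= min_prob:
            calls.setdefault(read_id, set()).add(int(pos))
    return calls


def annotate(seq, positions):
    """Write 'M' in place of each cytosine listed in positions; leave other bases as-is."""
    return "".join(
        "M" if i in positions and base == "C" else base
        for i, base in enumerate(seq)
    )


# Hypothetical side file: read ID, position, methylation probability.
side_file = io.StringIO(
    "read1\t1\t0.95\n"
    "read1\t4\t0.40\n"  # below threshold, ignored
    "read2\t4\t0.99\n"
)
reads = {"read1": "ACGTCGGA", "read2": "ACGTCGGA"}

calls = load_methylation_calls(side_file)
annotated = {rid: annotate(seq, calls.get(rid, set())) for rid, seq in reads.items()}
print(annotated)
```

Keeping the calls in a side file leaves the reads usable by existing four-letter tools, at the cost of having to keep the two files in sync.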

However, I doubt it would be better than what's available. Even in plants, long reads have made assembly very tractable these days. The plant data I have seen also suggests methylation to be highly pervasive and conserved, especially in the repeats which are most difficult to resolve in assembly.

If more data show widespread differential methylation between haplotypes, then it might be interesting there, but more evidence or a proof of concept would make sense before working on this.
