Question: De Bruijn Graph in genome assembly
3.4 years ago by
andynkili10
andynkili10 wrote:

I'm not sure I fully understand De Bruijn graphs when it comes to genome assembly.

I know that assembly with De Bruijn graphs is based on k-mers instead of overlapping sequences. But once the graph is built, finding the Eulerian path through it (i.e. the contig) depends on the fact that the reads overlap, doesn't it? I mean, would it be possible to assemble non-overlapping reads with a De Bruijn graph?

de bruijn graph assembly • 1.7k views
modified 3.4 years ago by 13en80 • written 3.4 years ago by andynkili10
3.4 years ago by
13en80
United Kingdom
13en80 wrote:

Have you tried looking at the bioinformatics algorithms courses on Coursera? They seem pretty good if you want to understand graph-based assembly. They've split the material up a bit since I did it, but this one https://www.coursera.org/course/assembly covers sequence assembly using De Bruijn graphs, including lectures and problem sets where you build your own basic assembler. They also address some of the more complex issues around repeats, errors, etc. and how they complicate the graph structures, but they don't expect you to handle those yourself!

I'm not quite sure what you mean about assembling non-overlapping reads though. How would you do that at all? With or without graph-based methods, without an overlap there's no information about what order your reads should be in, and assembly would be basically impossible. Unless you're talking about aligning against a reference?

Thanks for the link. I totally agree with you that "With or without graph based methods, without an overlap there's no information about what order your reads should be in and assembly would be basically impossible". I had a discussion with the developer of a De Bruijn (DB) graph assembly tool and we didn't understand each other about overlapping reads in the context of DB graphs, but never mind. I just wanted to be sure that assembly is impossible without any overlap information.

The way I see it, there's no explicit read-to-read overlap step in De Bruijn graphs, just information about shared k-mers (although I suppose you could think of this as overlap). Remember, the reads are split into k-mers for De Bruijn graphs. The graph is built from those k-mers, not from the reads themselves.
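
To make that concrete, here is a minimal sketch in Python, not taken from any real assembler. The function names (`kmers`, `de_bruijn`, `count_components`), the toy reads, and the choice of k = 4 are all made up for illustration. It splits reads into k-mers, uses the (k-1)-mers as nodes, and counts connected pieces of the graph. Two reads that overlap share a k-mer and land in one piece; two reads with nothing in common stay in separate pieces, so no path can join them.

```python
from collections import defaultdict


def kmers(read, k):
    """Return every k-mer of a read, in order of position."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn(reads, k):
    """Build a toy De Bruijn graph.

    Nodes are (k-1)-mers; each distinct k-mer contributes one edge
    from its prefix to its suffix. Reads are only used as a source
    of k-mers, they are not stored in the graph.
    """
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def count_components(graph):
    """Count weakly connected components (edge direction ignored)."""
    undirected = defaultdict(set)
    for src, dsts in graph.items():
        for dst in dsts:
            undirected[src].add(dst)
            undirected[dst].add(src)

    seen = set()
    components = 0
    for start in undirected:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(undirected[node] - seen)
    return components


# Hypothetical toy reads: the first pair shares the 4-mer "GCGT",
# the second pair shares no 3-mer at all.
overlapping = ["ATGGCGT", "GCGTACC"]
disjoint = ["ATGGCGT", "ACCTAGA"]

print(count_components(de_bruijn(overlapping, 4)))  # 1
print(count_components(de_bruijn(disjoint, 4)))     # 2
```

So the graph never asks "do these two reads overlap?", but if the reads share no sequence at all, their k-mers end up in disconnected parts of the graph and there is nothing to assemble across. Real assemblers also have to deal with sequencing errors, repeats and reverse complements, which this sketch ignores.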