Question: Orientation of short reads when building a de Bruijn graph
11 months ago, 934963534 wrote:

I know a node can refer to both the original k-mer and its forward-reverse k-mer, but how should one deal with short reads that are in a different orientation relative to the reference? For example, given the reference genome AAACCT, should ACCT (TGGA) (forward) and TCCA (AGGT) (backward) also be considered the same node in the de Bruijn graph, or should they be split into two separate nodes?

Modified 11 months ago by Devon Ryan • written 11 months ago by 934963534
11 months ago, Devon Ryan (Freiburg, Germany) wrote:

Generally one uses the "canonical k-mer" when building de Bruijn graphs. This is typically whichever of a k-mer and its reverse complement comes first alphabetically (or numerically first, if you're representing k-mers as numbers). So in your example, ACCTTGGA would be stored. You'll have to account for this when traversing the graph, of course.

Written 11 months ago by Devon Ryan
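
The canonical k-mer rule described in the answer above can be sketched in a few lines of Python. This is only a minimal illustration, assuming uppercase sequences over A/C/G/T; the function names `reverse_complement`, `canonical`, and `canonical_kmers` are hypothetical and not taken from any particular assembler.

```python
# Translation table for complementing bases (assumes uppercase A/C/G/T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Complement every base, then reverse the string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Return whichever of the k-mer and its reverse complement sorts first."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def canonical_kmers(read, k):
    """List the canonical form of every k-mer in a read, in read order."""
    return [canonical(read[i:i + k]) for i in range(len(read) - k + 1)]


# The 8-mer from the answer and its reverse complement share one stored form.
print(canonical("ACCTTGGA"))  # ACCTTGGA
print(canonical("TCCAAGGT"))  # ACCTTGGA

# A read and its reverse complement yield the same set of canonical 4-mers.
print(sorted(canonical_kmers("AAACCT", 4)))  # ['AAAC', 'AACC', 'ACCT']
print(sorted(canonical_kmers("AGGTTT", 4)))  # ['AAAC', 'AACC', 'ACCT']
```

Because both strands collapse onto the same stored strings, a read from either strand of the reference lands on the same nodes; the traversal code then has to keep track of which orientation of each node it is currently reading.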

So that is to say, when traversing the graph, a node actually represents four cases: forward (ACCT), backward (TCCA), forward-reverse (TGGA), and backward-reverse (AGGT)? Would that cause more branches?

Reply written 11 months ago by 934963534

I assumed you had 8-mers. A node never represents its reverse; it is either the sequence or its reverse complement.

Reply written 11 months ago by Devon Ryan
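
To make the distinction between a reverse and a reverse complement concrete, the four 4-mers from the question can be grouped by their canonical form. The snippet below is a hedged sketch, assuming uppercase A/C/G/T input; `group_by_node` and the other names are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict

# Base pairing used for complementing (assumes uppercase A/C/G/T only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Walk the sequence backwards, complementing each base."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical(seq):
    """Pick the alphabetically smaller of a sequence and its reverse complement."""
    return min(seq, reverse_complement(seq))


def group_by_node(kmers):
    """Map each canonical k-mer (one graph node) to the inputs that share it."""
    nodes = defaultdict(list)
    for kmer in kmers:
        nodes[canonical(kmer)].append(kmer)
    return dict(nodes)


groups = group_by_node(["ACCT", "TCCA", "TGGA", "AGGT"])
for node, members in groups.items():
    print(node, members)
# ACCT ['ACCT', 'AGGT']
# TCCA ['TCCA', 'TGGA']
```

Under this convention ACCT and AGGT are reverse complements of each other and share one node, while the plain reverse TCCA (whose reverse complement is TGGA) ends up in a different node.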