I know a node can represent both the original k-mer and its reverse-complement k-mer, but how should I deal with the fact that short reads can also be in a different orientation relative to the reference? For example, given the reference genome AAACCT, should ACCT (TGGA) (forward) and TCCA (AGGT) (backward) also be considered the same node in the de Bruijn graph, or should they be split into two separate nodes?
Generally one uses the "canonical k-mer" when making de Bruijn graphs. This is typically whichever of a k-mer and its reverse complement comes first alphabetically (or numerically first if you're representing them as numbers). So in your example, ACCT (TGGA) would be stored. You'll have to account for this when traversing the graph, of course.
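To make the canonical k-mer idea concrete, here is a minimal Python sketch. The function names (`reverse_complement`, `canonical`, `canonical_kmers`) are made up for illustration and do not come from any particular assembler, and the code assumes reads contain only the uppercase bases A, C, G and T.

```python
# Complement table for DNA bases (assumption: only A, C, G, T occur).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Return the alphabetically smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(read, k):
    """List the canonical form of every k-mer in a read, in read order."""
    return [canonical(read[i:i + k]) for i in range(len(read) - k + 1)]


# A read from the forward strand and a read from the reverse strand
# of the same region produce the same set of node keys.
forward_read = "AAACCT"
reverse_read = reverse_complement(forward_read)
print(set(canonical_kmers(forward_read, 4)) == set(canonical_kmers(reverse_read, 4)))
```

With this scheme, ACCT and its reverse complement AGGT both map to the key ACCT, so reads from either strand land on the same node. The traversal caveat still applies: the stored key does not tell you which orientation a given read used, so the edge-following code has to track that separately.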