Question

Help interpreting KEGG module definitions for converting to NetworkX graph

10 weeks ago

O.rka ▴ 740

I'm trying to convert KEGG module definitions to NetworkX DiGraph objects, which means I need to learn how to parse the definition strings. I'm starting with glycolysis since it is the first module (M00001): https://www.genome.jp/module/M00001

Here is the definition:

(K00844,K12407,K00845,K25026,K00886,K08074,K00918) (K01810,K06859,K13810,K15916)
(K00850,K16370,K21071,K00918) (K01623,K01624,K11645,K16305,K16306) K01803 
((K00134,K00150) K00927,K11389) (K01834,K15633,K15634,K15635) (K01689,K27394) 
(K00873,K12406)
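
One way to read a definition like this, assuming the usual KEGG convention that whitespace separates sequential steps and commas separate alternatives, is as a nested SEQ/OR tree. The sketch below is a minimal recursive-descent parser under that assumption. The names `parse_definition`, `"SEQ"`, and `"OR"` are made up for illustration, and the `+` (complex) and `-` (optional) operators that some modules use are not handled; any such token would simply be kept as one string.

```python
import re

DEFINITION = """
(K00844,K12407,K00845,K25026,K00886,K08074,K00918) (K01810,K06859,K13810,K15916)
(K00850,K16370,K21071,K00918) (K01623,K01624,K11645,K16305,K16306) K01803
((K00134,K00150) K00927,K11389) (K01834,K15633,K15634,K15635) (K01689,K27394)
(K00873,K12406)
"""


def parse_definition(definition):
    """Parse a definition into nested ("SEQ", [...]) / ("OR", [...]) tuples.

    Assumption: whitespace = sequential steps, comma = alternatives.
    """
    # Whitespace only separates tokens, so it can be dropped here;
    # adjacent items inside a sequence are then simply consecutive tokens.
    tokens = re.findall(r"[(),]|[^\s(),]+", definition)
    pos = 0

    def parse_alternatives():
        nonlocal pos
        options = [parse_sequence()]
        while pos < len(tokens) and tokens[pos] == ",":
            pos += 1  # consume ","
            options.append(parse_sequence())
        return options[0] if len(options) == 1 else ("OR", options)

    def parse_sequence():
        nonlocal pos
        steps = []
        while pos < len(tokens) and tokens[pos] not in (",", ")"):
            if tokens[pos] == "(":
                pos += 1  # consume "("
                steps.append(parse_alternatives())
                if pos >= len(tokens) or tokens[pos] != ")":
                    raise ValueError("unbalanced parentheses")
                pos += 1  # consume ")"
            else:
                steps.append(tokens[pos])
                pos += 1
        return steps[0] if len(steps) == 1 else ("SEQ", steps)

    tree = parse_alternatives()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree


module_tree = parse_definition(DEFINITION)
branch = parse_definition("((K00134,K00150) K00927,K11389)")
```

Under this reading, `branch` groups K00134/K00150 followed by K00927 as one alternative and K11389 alone as the other, which would be consistent with R07159 (C00118 -> C00197) bypassing C00236 in the reaction list.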

Here are the reactions:

R01786,R02189,R09085 C00267 -> C00668
R13199 C00668 -> C00085
R00756,R05805 C00085 -> C00354
R01068 C00354 -> C00111 + C00118
R01015 C00111 -> C00118
R01061,R01063 C00118 -> C00236
R01512 C00236 -> C00197
R07159 C00118 -> C00197
R01518 C00197 -> C00631
R00658 C00631 -> C00074
R00200 C00074 -> C00022
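
The reaction lines can be turned into compound-to-compound edges in a similar way. The sketch below assumes each line has the form `reaction IDs, space, substrates -> products`, with `+` between compounds, as in the listing above. The helper names `parse_reaction_line` and `compound_edges` are hypothetical, and it uses only the standard library.

```python
REACTIONS = """\
R01786,R02189,R09085 C00267 -> C00668
R13199 C00668 -> C00085
R00756,R05805 C00085 -> C00354
R01068 C00354 -> C00111 + C00118
R01015 C00111 -> C00118
R01061,R01063 C00118 -> C00236
R01512 C00236 -> C00197
R07159 C00118 -> C00197
R01518 C00197 -> C00631
R00658 C00631 -> C00074
R00200 C00074 -> C00022
"""


def parse_reaction_line(line):
    """Return (reaction_ids, substrates, products) for one reaction line."""
    ids, equation = line.split(" ", 1)
    left, right = equation.split("->")
    substrates = [c.strip() for c in left.split("+")]
    products = [c.strip() for c in right.split("+")]
    return ids.split(","), substrates, products


def compound_edges(text):
    """Return one (substrate, product, reaction_ids) tuple per compound pair."""
    edges = []
    for line in text.strip().splitlines():
        reaction_ids, substrates, products = parse_reaction_line(line)
        for substrate in substrates:
            for product in products:
                edges.append((substrate, product, reaction_ids))
    return edges


edges = compound_edges(REACTIONS)
# Compounds reachable in one step from C00118
targets_of_c00118 = sorted(p for s, p, _ in edges if s == "C00118")
```

Each tuple could then be added to a DiGraph with something like `graph.add_edge(substrate, product, reactions=reaction_ids)`. In this listing C00118 has two outgoing edges, to C00236 and directly to C00197, which is where the branch in question appears on the compound side.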

I've defined each node like this for NetworkX:

import networkx as nx

graph = nx.DiGraph()

node_a = frozenset({'K01803'})
node_b = frozenset({'K00134', 'K00150', 'K11389'})
node_c = frozenset({'K00927', 'K11389'})
node_d = frozenset({'K01834', 'K15633', 'K15634', 'K15635'})

How should I handle the potential bifurcation at the ((K00134,K00150) K00927,K11389) step when recreating the graph?

Method A:

graph.add_edge(node_a, node_b)
graph.add_edge(node_a, node_c)
graph.add_edge(node_b, node_c)
graph.add_edge(node_c, node_d)

Method B:

graph.add_edge(node_a, node_b)
graph.add_edge(node_b, node_c)
graph.add_edge(node_c, node_d)
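
To make the difference between the two methods concrete, the sketch below lists every simple path from node_a to node_d under each edge list. It uses a small plain-Python depth-first search so that it runs without dependencies; `nx.all_simple_paths(graph, node_a, node_d)` should give the same routes on the corresponding DiGraph. The function name `simple_paths` is made up for this example.

```python
node_a = frozenset({"K01803"})
node_b = frozenset({"K00134", "K00150", "K11389"})
node_c = frozenset({"K00927", "K11389"})
node_d = frozenset({"K01834", "K15633", "K15634", "K15635"})

METHOD_A = [(node_a, node_b), (node_a, node_c), (node_b, node_c), (node_c, node_d)]
METHOD_B = [(node_a, node_b), (node_b, node_c), (node_c, node_d)]


def simple_paths(edges, source, target):
    """List all paths from source to target that never revisit a node."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)

    paths = []

    def walk(node, path):
        if node == target:
            paths.append(path)
            return
        for nxt in successors.get(node, []):
            if nxt not in path:  # skip nodes already on this path
                walk(nxt, path + [nxt])

    walk(source, [source])
    return paths


paths_a = simple_paths(METHOD_A, node_a, node_d)
paths_b = simple_paths(METHOD_B, node_a, node_d)
```

Method A yields two routes (through node_b and node_c, or through node_c alone), while Method B yields only the linear route through both nodes.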

The module graph on KEGG (not sure what they call this) looks like this: [image not included]

The compound graph on KEGG (again, not sure of the actual name) looks like this around the area in question: [image not included]

definition kegg module database genomics • 199 views
