Hey all,
So I'm downloading XML files from KEGG using Bioconductor and I'm running into a problem. If we look at the HIF-1 signaling pathway, we see the node growth factor (GF) on the left-hand side, and we can trace it to this XML entry:
<entry id="17" name="hsa:1950 hsa:3479 hsa:3630" type="gene"
link="https://www.kegg.jp/dbget-bin/www_bget?hsa:1950+hsa:3479+hsa:3630">
<graphics name="EGF, HOMG4, URG..." fgcolor="#000000" bgcolor="#BFFFBF"
type="rectangle" x="87" y="272" width="46" height="17"/>
</entry>
While "GF" doesn't appear anywhere in this entry, it does link to three gene info cards that I can use to extract information. However, when I follow that link, nothing on those info cards says that these three genes belong under the general label GF.
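To make concrete what the entry above does and doesn't carry, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. The helper name summarize_entry is hypothetical, and the XML string is just the entry quoted above; this only reads the fields that are present, so it does not recover the GF label.

```python
import xml.etree.ElementTree as ET

# The entry quoted above, copied verbatim from the downloaded XML file.
KGML_ENTRY = """<entry id="17" name="hsa:1950 hsa:3479 hsa:3630" type="gene"
link="https://www.kegg.jp/dbget-bin/www_bget?hsa:1950+hsa:3479+hsa:3630">
<graphics name="EGF, HOMG4, URG..." fgcolor="#000000" bgcolor="#BFFFBF"
type="rectangle" x="87" y="272" width="46" height="17"/>
</entry>"""


def summarize_entry(entry_xml):
    """Hypothetical helper: collect the fields present on one <entry> element."""
    entry = ET.fromstring(entry_xml)
    graphics = entry.find("graphics")  # child element holding the drawn label
    return {
        "id": entry.get("id"),
        "type": entry.get("type"),
        # The name attribute is a space-separated list of gene identifiers.
        "genes": entry.get("name", "").split(),
        "link": entry.get("link"),
        "graphics_name": graphics.get("name") if graphics is not None else None,
    }


summary = summarize_entry(KGML_ENTRY)
print(summary["genes"])
print(summary["graphics_name"])
```

The only label-like text here is the graphics name, which lists gene symbols rather than the GF title shown on the map.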
Normally this isn't a problem, since I would just look at the map, but I'm trying to do this at scale, where I can't check every instance by hand.
So my question is: how do I find the labels that KEGG uses in its maps, using the XML file or a Python package?