Question: KEGG And Network Visualization - Getting Started
7.4 years ago by
miz20 wrote:

I'm trying to get the hang of using KEGG data, and I'm having a hard time figuring out where to start. I have a list of metagenomic genes that have been annotated with KO numbers. There are a couple of things I would like to do with this data that I haven't figured out yet:

  1. Visualize the full metabolic network: I know iPath does something like this, but I'm not sure how to turn my list of KOs into the correct input format for iPath. Additionally, I would prefer something that runs locally.
  2. Get a list of nodes and edges within the network. This would be ideal as input for a network program like Cytoscape or Python's NetworkX. How do I get from KOs to nodes and edges?

I think these are pretty basic tasks for KEGG data, but I'm stuck. Any help/links to resources would be much appreciated.

pathway genome kegg • 2.8k views
written 7.4 years ago by miz20

One problem you'll encounter is that the KEGG FTP is not free. They changed it to a subscription-based system last year. To get at the nodes and edges, you'll need their pathway files, which are located on their FTP. I had to resort to web scraping some of their data.
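
For anyone who does get hold of pathway files, here is a minimal sketch of turning one into nodes and edges. It assumes the files are in KGML (KEGG's XML pathway format), where `entry` elements of type `ortholog` carry KO numbers and `relation` elements link two entries by id. The function name `kgml_to_graph` and the toy document below are made up for illustration; real files are larger and should be checked against the KGML documentation.

```python
import xml.etree.ElementTree as ET


def kgml_to_graph(kgml_text, my_kos):
    """Return (nodes, edges) for the KOs in my_kos found in one KGML document."""
    root = ET.fromstring(kgml_text)
    wanted = set(my_kos)

    # Map each KGML entry id to the KOs from our own list that it carries.
    entry_kos = {}
    for entry in root.iter("entry"):
        if entry.get("type") != "ortholog":
            continue
        # One entry name can hold several KOs, e.g. "ko:K00844 ko:K12407".
        names = entry.get("name", "").split()
        kos = {n.split(":", 1)[1] for n in names if n.startswith("ko:")}
        hits = kos & wanted
        if hits:
            entry_kos[entry.get("id")] = sorted(hits)

    nodes = sorted({ko for kos in entry_kos.values() for ko in kos})

    # A relation becomes an edge only if both of its entries matched our list.
    edges = set()
    for rel in root.iter("relation"):
        for a in entry_kos.get(rel.get("entry1"), []):
            for b in entry_kos.get(rel.get("entry2"), []):
                if a != b:
                    edges.add((a, b, rel.get("type")))
    return nodes, sorted(edges)


# Toy fragment written by hand for this example, not a real KEGG file.
TOY_KGML = """<pathway name="path:ko00010" org="ko" number="00010">
  <entry id="1" name="ko:K00844 ko:K12407" type="ortholog"/>
  <entry id="2" name="ko:K01810" type="ortholog"/>
  <entry id="3" name="cpd:C00668" type="compound"/>
  <relation entry1="1" entry2="2" type="ECrel">
    <subtype name="compound" value="3"/>
  </relation>
</pathway>"""

nodes, edges = kgml_to_graph(TOY_KGML, ["K00844", "K01810"])
print(nodes)
print(edges)
```

Running this over every pathway file and merging the results would give one combined node and edge list. KOs from your list that appear in no relation still show up as nodes, just without edges.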

written 7.4 years ago by Damian Kao15k

Did you use their API for that? Or is there some other database you would recommend?

written 7.4 years ago by miz20

I web scraped it. I literally just downloaded most of their HTML pages with curl and parsed the information with a script. Their web service is pretty amenable to this type of heavy-handed data mining. However, I do not recommend doing this, as it is, admittedly, kind of a dick move on my part. The site isn't really meant to be used that way, and scraping it probably causes a lot of unnecessary traffic load.

modified 7.4 years ago • written 7.4 years ago by Damian Kao15k
7.4 years ago by
Josh Herr5.7k
University of Nebraska
Josh Herr5.7k wrote:

Like you, I am working with metagenomic data, and I had a similar problem this past year getting KEGG data for my metagenomic reads. As Damian says, the KEGG FTP is not free, and it's quite expensive; it wasn't simply a matter of us shelling out some cash for it.

My solution was to use MG-RAST. You'll have to upload your data and run it through their pipeline, but once that is done you can download the KEGG information from the analysis section. I was then able to output the node and edge data and use the KGMLReader application from within Cytoscape. The downside is that, depending on how much sequence data you have and on their server load, it can take up to a week to run your data through the MG-RAST pipeline. MG-RAST is a web server, so you won't be able to run that part locally, but you can then take the tabular output and load it into Cytoscape locally.
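
As a rough illustration of that last step, here is a minimal sketch that writes an edge list in Cytoscape's SIF layout (source, interaction type, target, one edge per line, tab-separated). The function name `edges_to_sif` and the example edge are hypothetical; the exact columns you get out of MG-RAST will differ, so treat the input tuples as an assumption.

```python
def edges_to_sif(edges):
    """Format (source, target, interaction) tuples as tab-separated SIF text."""
    lines = []
    for source, target, interaction in edges:
        # SIF puts the interaction type between the two node names.
        lines.append(f"{source}\t{interaction}\t{target}\n")
    return "".join(lines)


# Hypothetical edge between two KO nodes, for illustration only.
example_edges = [("K00844", "K01810", "ECrel")]
sif_text = edges_to_sif(example_edges)
print(sif_text, end="")
```

Saving `sif_text` to a file ending in `.sif` gives something Cytoscape can import as a network. The same tuples could also be added to a NetworkX graph one edge at a time if you would rather stay in Python.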

There may be another web service out there that will provide you with the node and edge data. I agree with Damian: I'm sure there are ways to find this data online, but you'll have to do some serious searching and/or data format manipulation.

modified 7.4 years ago • written 7.4 years ago by Josh Herr5.7k