Question: How To Extract GO Terms From A Given KEGG ID
Rm wrote (6.8 years ago):

How to use biomart to link KEGG pathway ID to GO terms?

Tags: biomart, kegg, go
Neilfws wrote (6.8 years ago):

I don't think this is possible using most web-based implementations of BioMart, since the underlying database does not contain KEGG identifiers.

The closest I can find to what you want is this file, mapping KEGG reaction IDs to GO terms.


Rm commented (6.8 years ago):

Thanks @Neilfws. Maybe I need to link through genes: gene -> GO and gene -> KEGG, then infer KEGG -> GO from the shared genes.
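
The gene-based linking idea from this comment can be sketched as a simple join over two gene-keyed tables. This is only a rough illustration under assumptions: the gene names, pathway names, and GO labels below are made-up placeholders, and `pathway_to_go`, `GENE_TO_GO`, and `GENE_TO_PATHWAYS` are hypothetical names, not part of any KEGG or BioMart API. Real tables would come from whatever gene annotation source you use.

```ruby
require 'set'

# Hypothetical toy data (placeholders, not real annotations):
# each gene maps to the GO terms and pathways it is annotated with.
GENE_TO_GO = {
  'geneA' => ['GO:x1', 'GO:x2'],
  'geneB' => ['GO:x2'],
  'geneC' => ['GO:x3']
}.freeze

GENE_TO_PATHWAYS = {
  'geneA' => ['pathway1'],
  'geneB' => ['pathway1', 'pathway2'],
  'geneC' => ['pathway2']
}.freeze

# Join the two tables on the gene: every GO term of a gene is
# attributed to every pathway that gene belongs to.
def pathway_to_go(gene_to_pathways, gene_to_go)
  result = Hash.new { |hash, key| hash[key] = Set.new }
  gene_to_pathways.each do |gene, pathways|
    terms = gene_to_go.fetch(gene, []) # genes without GO terms add nothing
    pathways.each { |pathway| result[pathway].merge(terms) }
  end
  result
end

# Print one pathway per line, followed by its collected GO terms.
pathway_to_go(GENE_TO_PATHWAYS, GENE_TO_GO).each do |pathway, terms|
  puts "#{pathway}\t#{terms.to_a.sort.join(',')}"
end
```

Note that a mapping built this way is many-to-many and only as specific as the gene annotations: a pathway collects every GO term of every member gene.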
Joachim wrote (6.8 years ago):

You can get GO terms that are linked to KEGG pathways via the KEGG API.

This Ruby script, go.rb, uses BioRuby to extract GO term(s):

require 'bio'

# Read in pathway ID from the command line:
pathway_id = ARGV[0]

# Connect to the public KEGG API server:
server = Bio::KEGG::API.new

# Retrieve a single pathway:
pathway_sheet = server.get_entries(["PATHWAY:#{pathway_id}"])

# Turn the textual representation into a Ruby object:
pathway = Bio::KEGG::PATHWAY.new(pathway_sheet)

# Check if there is a DB link to GO:
if pathway.dblinks.key?('GO')
  # Print each GO term on a separate line:
  pathway.dblinks['GO'].each do |term|
    puts "GO:#{term}"
  end
end

You can use this script on the command line as follows:

$ ruby go.rb hsa04020
GO:0019722
$ ruby go.rb hsa04210
GO:0006915
...

Each call prints the GO term(s) linked to the pathway given on the command line (here hsa04020 and hsa04210).

Hope that helps.

UPDATE:

An R solution using the KEGGSOAP package from Bioconductor:

# For installing Bioconductor and the KEGGSOAP package, run:
# source("http://bioconductor.org/biocLite.R")
# biocLite("KEGGSOAP")

library(KEGGSOAP)

# Get the textual representation of the pathway:
# (For now, there is no function like get.genes.by.pathway for getting dblinks.)
pathway <- bget("PATHWAY:hsa04020")

# Split the very long textual description into individual lines:
pathway.lines <- unlist(strsplit(pathway, '\n'))

# Create an empty vector for storing GO terms of the pathway:
pathway.go.terms <- c()

# Create a variable that is set to TRUE when we are processing the DBLINKS section:
in.dblinks <- FALSE

# Go through the pathway description line-by-line
# (seq_along also copes with an empty vector, unlike 1:length(...)):
for (n in seq_along(pathway.lines)) {
  # If we are in the DBLINKS section, figure out when we leave it again:
  if (in.dblinks && substring(pathway.lines[n], 1, 1) != " ")
    in.dblinks <- FALSE

  # When we see the beginning of the DBLINKS section, jot this down:
  if (!in.dblinks && substring(pathway.lines[n], 1, 8) == "DBLINKS ")
    in.dblinks <- TRUE

  # If we are in the DBLINKS section, then look out for GO terms and save them:
  if (in.dblinks && substring(pathway.lines[n], 13, 15) == "GO:")
    pathway.go.terms <- append(pathway.go.terms, substring(pathway.lines[n], 13))
}

# The GO terms of the pathway are now accumulated in the vector pathway.go.terms.
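
For comparison, the same DBLINKS scan can be sketched in Ruby against a plain-text entry, without any KEGG client library. This is a minimal sketch under stated assumptions: `go_terms_from_flatfile` and `SAMPLE_ENTRY` are hypothetical names, the sample text is a hand-written stand-in for a KEGG flat-file entry (the GO ID is the one the Ruby script above prints for hsa04020), and field values are assumed to start at column 13, as in the R code. How the entry text is fetched is left open.

```ruby
# Hand-written stand-in for a KEGG flat-file entry (assumption: section
# names start at column 1 and values start at column 13).
SAMPLE_ENTRY = <<~FLATFILE
  ENTRY       hsa04020                    Pathway
  NAME        Calcium signaling pathway - Homo sapiens (human)
  DBLINKS     GO: 0019722
  ORGANISM    Homo sapiens (human) [GN:hsa]
FLATFILE

# Collect the GO IDs listed in the DBLINKS section of a flat-file entry.
def go_terms_from_flatfile(text)
  in_dblinks = false
  terms = []
  text.each_line do |raw|
    line = raw.chomp
    if line.start_with?('DBLINKS')
      in_dblinks = true            # entering the DBLINKS section
    elsif line =~ /\A\S/
      in_dblinks = false           # any other section header ends it
    end
    next unless in_dblinks

    value = line[12..-1] || ''     # text from column 13 onwards
    match = value.match(/\AGO:\s*(.+)\z/)
    next unless match

    # Several IDs may share one line, separated by spaces.
    match[1].split.each { |id| terms << "GO:#{id}" }
  end
  terms
end

puts go_terms_from_flatfile(SAMPLE_ENTRY)
```

Unlike the R loop, this version normalises each hit to the `GO:0019722` form instead of keeping the raw `GO: 0019722` text.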

Rm commented (6.8 years ago):

Thanks @Joachim. Is there an R alternative?

Joachim commented (6.8 years ago):

Well, there is always: go <- system2("ruby", "go.rb hsa04020", stdout=TRUE)

Rm commented (6.8 years ago):

I tried something similar, as described here: http://www.r-bloggers.com/calling-ruby-perl-or-python-from-r/ On Windows, though, I would need to install Ruby and everything else first.
Joachim commented (6.8 years ago):

I updated my answer with an R solution. Big thanks to Neil for pointing out KEGGSOAP. Too bad that a get.dblinks.by.pathway function has not been implemented yet, though.

Rm commented (6.8 years ago):

@Joachim: +1. Thanks for the R update too. I also appreciate that you commented the code step by step.

Neilfws commented (6.8 years ago):

R/Bioconductor has multiple KEGG-related packages: http://bioconductor.org/help/search/index.html?q=kegg. KEGGSOAP may do what you want.

Rm commented (6.8 years ago):

Thanks @Neilfws: I will give it a try...
daveshire wrote (17 months ago):

I know this is a dead thread, but I wanted to do roughly the same thing as the first poster and found that KEGG's LinkDB system works pretty well. It was easy to pull up a list of all KO : GO term matches, and it looks like it can be used for various other mappings, though I haven't tried them all.

http://www.genome.jp/linkdb/

Shaurya Jauhari wrote (7 weeks ago):

By transitivity (GO <-> Orthology (KO terms), Orthology <-> PubMed ID, PubMed ID <-> Pathway), the KEGG API / LinkDB lets you build a many-to-many linkage map between GO and pathway terms that isn't directly available (although it is marked 'routed' on the official page). You have to construct it explicitly.

P.S. That said, I do question the validity of this mapping. A GO ID is indicative of a gene, while a KEGG ID is indicative of a pathway. By doing the above, we throw away quite a lot of background information by representing a pathway merely by a gene.
