Question: Get pathway information from KEGG using Entry id
18 months ago by
sinumolgeorge10 wrote:

Hi, I have a list of ids from KEGG, is there any way to get the corresponding pathway information in a single search?

The KEGG IDs are: stm:STM2395, sau:SA0132, efa:EF2297

Expected output: stm:STM2395 : Pathway : Cationic antimicrobial peptide (CAMP) resistance

Thanks in advance

sequencing snp next-gen R gene • 1.4k views
ADD COMMENTlink modified 18 months ago by EagleEye6.2k • written 18 months ago by sinumolgeorge10
> library(KEGGREST)
> kegg_ids=read.csv("kegg_ids", header = F, stringsAsFactors = F)
> kegg_ids
            V1
1 stm:STM2395 
2 sau:SA0132
3 efa:EF2297

> data.frame(Pathway=unlist(sapply(keggGet(kegg_ids[,1]), "[[", "PATHWAY")),stringsAsFactors = F)
                                                      Pathway
    stm01503 Cationic antimicrobial peptide (CAMP) resistance
    stm02020                             Two-component system
    efa00550                       Peptidoglycan biosynthesis
    efa01100                               Metabolic pathways
    efa01502                            Vancomycin resistance
    efa02020                             Two-component system
ADD REPLYlink modified 18 months ago • written 18 months ago by cpad011211k
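
One caveat with the keggGet() call above: as far as I know, keggGet() returns at most 10 entries per request, so a longer ID list has to be split into batches. A minimal sketch, assuming that limit; chunk_ids and kegg_get_all are hypothetical helper names, not part of KEGGREST, and kegg_get_all needs the KEGGREST package plus network access.

```r
# Split a vector of KEGG IDs into batches of at most `size` entries.
# (Hypothetical helper; also trims stray whitespace such as "stm:STM2395 ".)
chunk_ids <- function(ids, size = 10) {
  ids <- trimws(ids)
  split(ids, ceiling(seq_along(ids) / size))
}

# Hypothetical wrapper: query KEGG batch by batch and concatenate the entries.
# Assumes keggGet() accepts a character vector of up to `size` IDs.
kegg_get_all <- function(ids, size = 10) {
  batches <- chunk_ids(ids, size)
  unlist(lapply(batches, KEGGREST::keggGet),
         recursive = FALSE, use.names = FALSE)
}

# Offline illustration of the batching step only
batches <- chunk_ids(c("stm:STM2395 ", "sau:SA0132", "efa:EF2297"), size = 2)
```

With real data, kegg_get_all(kegg_ids[,1]) would stand in for keggGet(kegg_ids[,1]) in the one-liner above.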

Thank you, but it only gives the pathway ID. How do we know which KEGG ID each pathway belongs to?

ADD REPLYlink written 18 months ago by sinumolgeorge10

I want to get the kegg_id along with the result. Are there any options?

ADD REPLYlink written 18 months ago by sinumolgeorge10
library(KEGGREST)
library(purrr)
library(magrittr)


kegg_ids=read.csv("kegg_ids", header = F, stringsAsFactors = F)
kegg_ids

kegg_pathways=data.frame(Pathway=unlist(sapply(keggGet(kegg_ids[,1]), "[[", c("PATHWAY"))),stringsAsFactors = F)
kegg_list=map(keggGet(kegg_ids[,1]), extract, c("ENTRY", "PATHWAY"))
library(dplyr)
kegg_df=bind_rows(lapply(kegg_list, function (x) data.frame(t(unlist(x)),stringsAsFactors = F)))
library(tidyr)
kegg_df1=na.omit(gather(kegg_df,key = "", value = Pathway, -ENTRY.CDS))[,c(1,3)]

Note: beware of the library loading order. Some libraries mask functions from others, which results in execution errors.

Output:

> na.omit(gather(kegg_df,key = "", value = Pathway, -ENTRY.CDS))[,c(1,3)]
   ENTRY.CDS                                          Pathway
1    STM2395 Cationic antimicrobial peptide (CAMP) resistance
4    STM2395                             Two-component system
9     EF2297                       Peptidoglycan biosynthesis
12    EF2297                               Metabolic pathways
15    EF2297                            Vancomycin resistance
18    EF2297                             Two-component system
ADD REPLYlink modified 18 months ago • written 18 months ago by cpad011211k

Error in UseMethod("extract_") : no applicable method for 'extract_' applied to an object of class "list"

ADD REPLYlink written 18 months ago by sinumolgeorge10

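This happens because one library masks a function from another: here extract resolves to tidyr::extract instead of magrittr::extract. I added a note about this to my previous reply. Here is the updated code with the libraries loaded in the correct order (run it in a fresh R session):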

library(KEGGREST)
kegg_ids=read.csv("test.txt", header = F, stringsAsFactors = F)
library(purrr)
library(magrittr)
# Qualify extract with its namespace so that tidyr::extract cannot mask it
kegg_list=map(keggGet(kegg_ids[,1]), magrittr::extract, c("ENTRY", "PATHWAY"))
library(dplyr)
kegg_df=bind_rows(lapply(kegg_list, function (x) data.frame(t(unlist(x)),stringsAsFactors = F)))
library(tidyr)
kegg_df1=na.omit(gather(kegg_df,key = "", value = Pathway, -ENTRY.CDS))[,c(1,3)]
ADD REPLYlink modified 18 months ago • written 18 months ago by cpad011211k
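
Since the extract error reported above comes from name masking, another way to sidestep it is to skip extract entirely and pull the fields with base R's [[. The sketch below assumes each element returned by keggGet() is a list with an ENTRY field and an optional named PATHWAY vector, as the output in this thread suggests. pathway_table is a hypothetical helper name, and the mock list (values copied from the thread) stands in for a real keggGet() result so the example runs offline.

```r
# Build one row per entry/pathway pair from keggGet()-style output.
# Entries without a PATHWAY field are skipped in this version.
pathway_table <- function(entries) {
  rows <- lapply(entries, function(e) {
    pw <- e[["PATHWAY"]]
    if (is.null(pw)) return(NULL)
    data.frame(Entry = unname(e[["ENTRY"]][1]),
               Pathway_ID = names(pw),
               Pathway = unname(pw),
               stringsAsFactors = FALSE)
  })
  rows <- Filter(Negate(is.null), rows)   # drop the skipped entries
  do.call(rbind, rows)
}

# Mock data shaped like keggGet() output (assumed structure)
mock_entries <- list(
  list(ENTRY = c(CDS = "STM2395"),
       PATHWAY = c(stm01503 = "Cationic antimicrobial peptide (CAMP) resistance",
                   stm02020 = "Two-component system")),
  list(ENTRY = c(CDS = "SA0132"))          # no PATHWAY field
)

pathway_tab <- pathway_table(mock_entries)
```

Because no function named extract is called, the result does not depend on which packages are attached or in what order.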

Thank you very much

ADD REPLYlink written 18 months ago by sinumolgeorge10

I used GENE NAME instead of PATHWAY to get the corresponding gene name, but it returns no output. Is there another keyword for fetching the gene name of a KEGG ID?

ADD REPLYlink written 18 months ago by sinumolgeorge10

Use NAME for the gene name:

library(KEGGREST)
library(purrr)
library(magrittr)

kegg_ids=read.csv("test.txt", header = F, stringsAsFactors = F)

kegg_pathways=data.frame(Pathway=unlist(sapply(keggGet(kegg_ids[,1]), "[[", c("PATHWAY"))),stringsAsFactors = F)
kegg_list=map(keggGet(kegg_ids[,1]), extract, c("ENTRY", "PATHWAY", "NAME"))
library(dplyr)
kegg_df=bind_rows(lapply(kegg_list, function (x) data.frame(t(unlist(x)),stringsAsFactors = F)))
library(tidyr)
na.omit(gather(kegg_df,key = "", value = Pathway, -c(ENTRY.CDS,NAME)))[,c(1,2,4)]

Output:

> na.omit(gather(kegg_df,key = "", value = Pathway, -c(ENTRY.CDS,NAME)))[,c(1,2,4)]
   ENTRY.CDS  NAME                                          Pathway
1    STM2395  pgtE Cationic antimicrobial peptide (CAMP) resistance
4    STM2395  pgtE                             Two-component system
9     EF2297 vanYB                       Peptidoglycan biosynthesis
12    EF2297 vanYB                               Metabolic pathways
15    EF2297 vanYB                            Vancomycin resistance
18    EF2297 vanYB                             Two-component system
ADD REPLYlink modified 18 months ago • written 18 months ago by cpad011211k

Thanks! Very useful. What if I want to list all the results, including the entries that have no pathways but still have a name or definition?

ADD REPLYlink modified 10 months ago • written 10 months ago by hadiaziz0
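
One possible approach to the question above is to give entries without a PATHWAY field a single row with NA in the pathway column, rather than dropping them with na.omit. This is a sketch only: all_entries_table and field_or_na are hypothetical helper names, the field names NAME and DEFINITION are assumptions about what keggGet() returns for a given entry, and the second mock entry is made up for illustration.

```r
# Return a field collapsed into one string, or NA if the entry lacks it.
field_or_na <- function(entry, field) {
  value <- entry[[field]]
  if (is.null(value)) NA_character_ else paste(unname(value), collapse = "; ")
}

# One row per entry/pathway pair; entries without pathways keep one NA row.
all_entries_table <- function(entries, fields = c("NAME", "DEFINITION")) {
  rows <- lapply(entries, function(e) {
    pw <- e[["PATHWAY"]]
    if (is.null(pw)) pw <- NA_character_
    out <- data.frame(Entry = rep(field_or_na(e, "ENTRY"), length(pw)),
                      stringsAsFactors = FALSE)
    for (f in fields) out[[f]] <- field_or_na(e, f)   # recycled over all rows
    out$Pathway <- unname(pw)
    out
  })
  do.call(rbind, rows)
}

# Mock data shaped like keggGet() output (assumed structure)
mock_entries <- list(
  list(ENTRY = c(CDS = "STM2395"), NAME = "pgtE",
       PATHWAY = c(stm01503 = "Cationic antimicrobial peptide (CAMP) resistance",
                   stm02020 = "Two-component system")),
  # Made-up entry: has a name and definition but no pathways
  list(ENTRY = c(CDS = "XX0001"), NAME = "fakeA",
       DEFINITION = "made-up definition")
)

all_tab <- all_entries_table(mock_entries)
```

With real data, all_entries_table(keggGet(kegg_ids[,1])) would be the corresponding call; fields missing from an entry simply show up as NA.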
18 months ago by
EagleEye6.2k wrote:

You can download all KEGG pathways with ids, description and corresponding genes involved as a simple table (plain text file) using GeneSCF.

ADD COMMENTlink written 18 months ago by EagleEye6.2k