Question: R: download all KEGG pathways including KO and Compounds
dago wrote:

I saw this question has been asked here and there before. However, I could not find a tool that does the job for me.

I want to download all pathways from KEGG including KO and compounds using R. I would imagine creating an R object like:

$Path_1
...KO
...Compounds
$Path_2
...KO
...Compounds
$Path_3
...KO
...Compounds

Any idea how to download the data?

Thank you

kegg system_biology R
Modified 7 weeks ago by ATpoint • written 7 weeks ago by dago

"all pathways from KEGG including KO and compounds using R."

That would violate KEGG's acceptable use policy (AUP) if you don't have a license.

Reply written 7 weeks ago by genomax

I did not think about this. I guess I am getting used to having open-source tools and databases. Thanks.

Reply written 7 weeks ago by dago
5heikki wrote:

You can use their API. However, it is not meant for downloading the entire database. For that there is the FTP site, which is behind a license.
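As a rough sketch of how API output could be shaped into the per-pathway object asked for in the question: the code below assumes the tab-separated, two-column layout (pathway ID, then linked entry) returned by the KEGG REST `link` operation, e.g. `https://rest.kegg.jp/link/ko/pathway`. The helper names `parse_links` and `build_pathway_list` are hypothetical, and the input rows are a few illustrative lines typed by hand rather than downloaded, so nothing here fetches the database in bulk.

```r
# Illustrative rows in the assumed two-column "link" layout (not downloaded).
ko_text  <- c("path:map00010\tko:K00844",
              "path:map00020\tko:K01647")
cpd_text <- c("path:map00010\tcpd:C00022",
              "path:map00010\tcpd:C00031",
              "path:map00020\tcpd:C00158")

# Hypothetical helper: parse link lines into a list of entries per pathway.
parse_links <- function(lines) {
  tab <- read.delim(text = lines, header = FALSE, sep = "\t",
                    col.names = c("pathway", "entry"),
                    stringsAsFactors = FALSE)
  # Drop the database prefixes ("path:", "ko:", "cpd:").
  tab$pathway <- sub("^path:", "", tab$pathway)
  tab$entry   <- sub("^[a-z]+:", "", tab$entry)
  split(tab$entry, tab$pathway)
}

# Hypothetical helper: combine KO and compound links into one nested list.
build_pathway_list <- function(ko_lines, cpd_lines) {
  ko  <- parse_links(ko_lines)
  cpd <- parse_links(cpd_lines)
  ids <- union(names(ko), names(cpd))
  out <- lapply(ids, function(id) {
    list(KO        = if (is.null(ko[[id]]))  character(0) else ko[[id]],
         Compounds = if (is.null(cpd[[id]])) character(0) else cpd[[id]])
  })
  names(out) <- ids
  out
}

pathways <- build_pathway_list(ko_text, cpd_text)
str(pathways)
```

Each element of `pathways` then has a `KO` and a `Compounds` component, mirroring the `$Path_1 ... KO ... Compounds` layout from the question. Whether a given volume of requests is acceptable under KEGG's terms is a separate matter, as noted in the comments above.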

Written 7 weeks ago by 5heikki

Ah right, that may be why I could not find any tool that does this!

Reply written 7 weeks ago by dago
ATpoint wrote:

MSigDB contains the KEGG pathways as gene sets: https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp. Download the GMT file and then load it into R, e.g. with:

kegg <- fgsea::gmtPathways("c2.cp.kegg.v7.1.symbols.gmt")

> head(kegg)
$KEGG_GLYCOLYSIS_GLUCONEOGENESIS
 [1] "ACSS2"   "GCK"     "PGK2"    "PGK1"    "PDHB"    "PDHA1"   "PDHA2"   "PGM2"   
 [9] "TPI1"    "ACSS1"   "FBP1"    "ADH1B"   "HK2"     "ADH1C"   "HK1"     "HK3"    
[17] "ADH4"    "PGAM2"   "ADH5"    "PGAM1"   "ADH1A"   "ALDOC"   "ALDH7A1" "LDHAL6B"
[25] "PKLR"    "LDHAL6A" "ENO1"    "PKM"     "PFKP"    "BPGM"    "PCK2"    "PCK1"   
[33] "ALDH1B1" "ALDH2"   "ALDH3A1" "AKR1A1"  "FBP2"    "PFKM"    "PFKL"    "LDHC"   
[41] "GAPDH"   "ENO3"    "ENO2"    "PGAM4"   "ADH7"    "ADH6"    "LDHB"    "ALDH1A3"
[49] "ALDH3B1" "ALDH3B2" "ALDH9A1" "ALDH3A2" "GALM"    "ALDOA"   "DLD"     "DLAT"   
[57] "ALDOB"   "G6PC2"   "LDHA"    "G6PC"    "PGM1"    "GPI"    

$KEGG_CITRATE_CYCLE_TCA_CYCLE
 [1] "IDH3B"    "DLST"     "PCK2"     "CS"       "PDHB"     "PCK1"     "PDHA1"   
 [8] "PDHA2"    "SUCLG2P2" "FH"       "SDHD"     "OGDH"     "SDHB"     "IDH3A"   
[15] "SDHC"     "IDH2"     "IDH1"     "ACO1"     "ACLY"     "MDH2"     "DLD"     
[22] "MDH1"     "DLAT"     "OGDHL"    "PC"       "SDHA"     "SUCLG1"   "SUCLA2"  
[29] "SUCLG2"   "IDH3G"    "ACO2"    

$KEGG_PENTOSE_PHOSPHATE_PATHWAY
 [1] "RPE"     "RPIA"    "PGM2"    "PGLS"    "PRPS2"   "FBP2"    "PFKM"    "PFKL"   
 [9] "TALDO1"  "TKT"     "FBP1"    "TKTL2"   "PGD"     "RBKS"    "ALDOA"   "ALDOC"  
[17] "ALDOB"   "H6PD"    "RPEL1"   "PRPS1L1" "PRPS1"   "DERA"    "G6PD"    "PGM1"   
[25] "TKTL1"   "PFKP"    "GPI"
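
If fgsea is not installed, a GMT file can also be read with base R. The sketch below assumes the usual GMT layout of one gene set per line, tab-separated, with the set name first, a description second, and the member genes after that. The function name `read_gmt` and the two toy sets are hypothetical and written to a temporary file only for illustration; they are not the MSigDB contents.

```r
# Hypothetical base-R reader for a GMT file (assumed layout:
# set name <TAB> description <TAB> gene1 <TAB> gene2 ...).
read_gmt <- function(path) {
  fields <- strsplit(readLines(path), "\t", fixed = TRUE)
  # Everything after the first two fields is a member gene.
  sets <- lapply(fields, function(x) x[-(1:2)])
  names(sets) <- vapply(fields, `[`, character(1), 1)
  sets
}

# Toy file with two made-up sets, just to exercise the reader.
tmp <- tempfile(fileext = ".gmt")
writeLines(c("TOY_GLYCOLYSIS\ttoy description\tGCK\tHK1\tHK2",
             "TOY_TCA\ttoy description\tCS\tIDH1"), tmp)

sets <- read_gmt(tmp)
lengths(sets)  # number of genes per set
```

The result is a named list of character vectors, the same shape as the `kegg` object shown above.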
Modified 7 weeks ago • written 7 weeks ago by ATpoint

That is actually great! But I don't think there are compounds here, just gene names. Right?

Reply written 7 weeks ago by dago

Not the best solution, because it works per organism, but genome-scale metabolic models (http://bigg.ucsd.edu/data_access) have all the information you need regarding the gene-protein-reaction associations. Once you have the gene IDs, getting the KOs with eggNOG should not be a problem.

Reply written 7 weeks ago by andres.firrincieli