How to do simple pathway analysis?
5.5 years ago
Seq225 ▴ 110

I have a set of 100 genes (the AA sequences) from a non-model plant species. Now I want to simply identify which pathways they belong to. I want to generate a typical KEGG style image/map highlighting my genes in the entire pathway.

I want to identify my genes that cluster in the same pathway.

How can I do that? I don't know much about KEGG or any other tools.

Thanks a lot!!

rna-seq genome gene • 2.6k views

You can simply use the web-based KEGG database, and you can also use the DAVID tool.

5.5 years ago
AK ★ 2.2k

Hi Seq225, I did it this way:

  1. Annotate the AA sequences using the web service BlastKOALA (see Step-by-step Instructions);
  2. Download the hierarchical data containing KOs and pathways from KEGG and save as "KO.keg";
  3. Parse "KO.keg" into tidy format ("KO.tsv");
  4. Join your KO assignment results (from step 1) with "KO.tsv" on the KO identifier; you'll then see which genes cluster in the same pathway.
$ wget -O KO.keg "http://www.genome.jp/kegg-bin/download_htext?htext=ko00001.keg&format=htext&filedir="
$ head KO.keg
+D  KO
#<h2><a href="/kegg/kegg2.html"><img src="/Fig/bget/kegg3.gif" align="middle" border=0></a>   KEGG Orthology (KO)</h2>
!
A09100 Metabolism
B
B  09101 Carbohydrate metabolism
C    00010 Glycolysis / Gluconeogenesis [PATH:ko00010]
D      K00844  HK; hexokinase [EC:2.7.1.1]
D      K12407  GCK; glucokinase [EC:2.7.1.2]
D      K00845  glk; glucokinase [EC:2.7.1.2]

$ head KO.tsv
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K00844  HK  hexokinase  EC:2.7.1.1
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K12407  GCK glucokinase EC:2.7.1.2
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K00845  glk glucokinase EC:2.7.1.2
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K01810  GPI, pgi    glucose-6-phosphate isomerase   EC:5.3.1.9
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K06859  pgi1    glucose-6-phosphate isomerase, archaeal EC:5.3.1.9
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K13810  tal-pgi transaldolase / glucose-6-phosphate isomerase   EC:2.2.1.2 5.3.1.9
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K15916  pgi-pmi glucose/mannose-6-phosphate isomerase   EC:5.3.1.9 5.3.1.8
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K00850  pfkA, PFK   6-phosphofructokinase 1 EC:2.7.1.11
09100   Metabolism  09101   Carbohydrate metabolism 00010   Glycolysis / Gluconeogenesis    K16370  pfkB    6-phosphofructokinase 2 EC:2.7.1.11
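
The join in step 4 could be sketched roughly as follows. This is only an illustration, not the exact commands I ran: the function names are made up, the KO assignment file is assumed to be a two-column tab-separated file (gene ID, then KO, with the KO missing for unannotated genes), and "KO.tsv" is assumed to be tab-separated with the pathway ID, pathway name and KO in columns 5, 6 and 7 as in the output above.

```python
from collections import defaultdict


def load_assignments(lines):
    """Return (gene, KO) pairs, skipping genes that have no KO assigned."""
    pairs = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 2 and fields[1].strip():
            pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


def load_ko_pathways(lines):
    """Map each KO to the set of (pathway ID, pathway name) it appears in."""
    ko_to_pathways = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 7:
            ko_to_pathways[fields[6]].add((fields[4], fields[5]))
    return ko_to_pathways


def genes_by_pathway(assignments, ko_to_pathways):
    """Group gene IDs under every pathway that contains their KO."""
    grouped = defaultdict(set)
    for gene, ko in assignments:
        for pathway in ko_to_pathways.get(ko, ()):
            grouped[pathway].add(gene)
    return dict(grouped)
```

Usage would look something like this (the file name "user_ko.txt" is an assumption; use whatever your annotation download is called):

    with open("user_ko.txt") as f: assignments = load_assignments(f)
    with open("KO.tsv") as f: ko_to_pathways = load_ko_pathways(f)
    for (path_id, path_name), genes in sorted(genes_by_pathway(assignments, ko_to_pathways).items()):
        print(path_id, path_name, ",".join(sorted(genes)), sep="\t")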

Hope it helps.


I revisited it and found a simpler way:

  • Submit your AA sequences to KofamKOALA;
  • When the job is done you'll receive a URL for the report. On that report page, click "KEGG Mapper" followed by "Reconstruct Pathway". You'll see your AA sequences grouped under each pathway:

[Image: Pathway Reconstruction]


Hi AK,

I downloaded the KO.keg file. I was wondering what tool(s) you used to generate the "KO.tsv" file as you showed here. Thanks.


Hi he1k,

Please use the following script, which downloads and parses the .keg data on the fly: keg_hierarchy_parser.py
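
In case the link is not reachable, here is a minimal sketch of this kind of parsing. It is not keg_hierarchy_parser.py itself: the names `parse_keg` and `keg_to_tsv` are made up for illustration, it reads an already downloaded file instead of fetching it, and it assumes the layout shown in the `head KO.keg` output above (A/B/C/D level prefixes, and `symbol; name [EC:...]` on the D lines).

```python
import re

# Trailing bracketed tag such as [PATH:ko00010] or [EC:2.7.1.1].
TRAILING_TAG = re.compile(r"\s*\[([^\]]*)\]\s*$")


def parse_keg(lines):
    """Yield one 10-field tuple per D (KO) line of a KEGG htext file."""
    a = b = c = ("", "")
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line[0] not in "ABCD":
            continue  # header lines (+D, #, !) or blank lines
        level, body = line[0], line[1:].strip()
        if not body:
            continue  # bare "B" spacer line
        ident, _, rest = body.partition(" ")
        rest = rest.strip()
        tag = TRAILING_TAG.search(rest)
        if level != "D":
            # Category or pathway line: drop a trailing [PATH:...] style tag.
            name = rest[:tag.start()] if tag else rest
            if level == "A":
                a, b, c = (ident, name), ("", ""), ("", "")
            elif level == "B":
                b, c = (ident, name), ("", "")
            else:
                c = (ident, name)
            continue
        # KO line: split off the EC tag (if any), then "symbol; name".
        ec = ""
        if tag and tag.group(1).startswith("EC:"):
            ec, rest = tag.group(1), rest[:tag.start()]
        symbol, sep, name = rest.partition("; ")
        if not sep:  # no gene symbol present
            symbol, name = "", rest
        yield (*a, *b, *c, ident, symbol, name, ec)


def keg_to_tsv(keg_path, tsv_path):
    """Write the parsed hierarchy as a tab-separated file."""
    with open(keg_path, encoding="utf-8") as src, \
            open(tsv_path, "w", encoding="utf-8") as out:
        for row in parse_keg(src):
            out.write("\t".join(row) + "\n")
```

Calling `keg_to_tsv("KO.keg", "KO.tsv")` should then give one row per KO with the same ten columns as in the `head KO.tsv` output above.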

Hope this helps!


Hi AK,

Thanks very much for sharing the script. It is very helpful. I really appreciate it.
