Question: How do I extract genes from a KEGG pathway?
7 weeks ago by
ammasakshay20
ammasakshay20 wrote:

Using R, I want to generate a list of only the genes in a pathway, without the accompanying text.

For example, if the input pathway is KEGG prostate cancer (hsa05215), I want my output to be a .csv list of the genes in that pathway. I tried:

library("KEGGREST")

keggGet("hsa05215")[[1]]$GENE

but that returns the gene number and gene description along with each gene symbol. I want a list consisting of the gene symbols alone.

How do I get this?

Thank you.

bioconductor R gene • 191 views
modified 7 weeks ago • written 7 weeks ago by ammasakshay20

Append all the hsa IDs to the URL below and fetch the result. You can then parse the output.

Ex: http://rest.kegg.jp/get/hsa:04140+hsa:04510+hsa:04919

Hope this solves your problem.

written 7 weeks ago by suniltakekar0

I did that here, http://rest.kegg.jp/get/hsa05215, and the gene list is similar to the one I got through R. It has a number, gene symbol and description on each line. I want the gene symbol alone. The output under GENE looks like this:

1027 CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]
1017 CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]
898 CCNE1; cyclin E1 [KO:K06626]

I want something that looks like:

CDKN1B
CDK2
CCNE1
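
For reference, a minimal sketch of pulling the symbols out of lines shaped like the ones quoted above, assuming every line follows the pattern "number SYMBOL; description". The sample lines are copied from this post rather than fetched live, and the variable names are only illustrative.

```r
# Sample lines copied from the GENE section quoted above (not fetched live)
lines <- c(
  "1027 CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]",
  "1017 CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]",
  "898 CCNE1; cyclin E1 [KO:K06626]"
)

# Capture the text between the leading gene number and the first ";"
symbols <- sub("^\\s*[0-9]+\\s+([^;]+);.*$", "\\1", lines)
print(symbols)
```

A line that does not match the assumed pattern is returned unchanged by sub(), so it is worth checking the result for leftover descriptions.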

written 7 weeks ago by ammasakshay20

Try this: http://rest.kegg.jp/list/hsa:90011+hsa:4550+hsa:2576

For details, refer to the KEGG API documentation: https://www.kegg.jp/kegg/rest/keggapi.html

modified 7 weeks ago • written 7 weeks ago by suniltakekar0

This doesn't help. That query only works for individual genes and does not return the gene list of a pathway. Try your solution with the hsa05215 pathway and you will see that it does not return the list of 90+ genes.

written 7 weeks ago by ammasakshay20
7 weeks ago by
ammasakshay20
ammasakshay20 wrote:

I ended up solving it myself. Hopefully this helps anyone who has a similar need.

library("KEGGREST")

# Get the GENE field: a vector that alternates between gene numbers
# and "SYMBOL; description" strings
genes <- keggGet("hsa05215")[[1]]$GENE
# Drop the gene numbers by keeping every second element (the even positions)
gene_info <- genes[seq(2, length(genes), by = 2)]
# Delete everything from the ";" onward on each line (this removes the gene description)
gene_symbols <- sub(";.*", "", gene_info)
# Export the vector as a csv
write.csv(gene_symbols, file = "hsa05215.csv", quote = FALSE, row.names = FALSE)
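
The same steps can be tried offline on a small hand-written vector that mimics the alternating layout of the GENE field. This is only a sketch: the sample vector is an assumption built from the three genes quoted earlier in the thread, not real keggGet output, and the helper name extract_symbols and the output file name are hypothetical.

```r
# Hypothetical helper: takes a vector that alternates between gene number and
# "SYMBOL; description", and returns the symbols only
extract_symbols <- function(gene_field) {
  # Even positions hold the "SYMBOL; description" strings
  info <- gene_field[seq(2, length(gene_field), by = 2)]
  # Drop the first ";" and everything after it
  sub(";.*", "", info)
}

# Hand-written sample mimicking the layout (an assumption, not fetched from KEGG)
sample_field <- c(
  "1027", "CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]",
  "1017", "CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]",
  "898",  "CCNE1; cyclin E1 [KO:K06626]"
)

symbols <- extract_symbols(sample_field)
# Write one symbol per line, with no header and no quotes
writeLines(symbols, "symbols_sample.csv")
```

Unlike write.csv, writeLines does not add a header row, so the file contains the symbols and nothing else.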
modified 7 weeks ago by WouterDeCoster38k • written 7 weeks ago by ammasakshay20

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.

written 7 weeks ago by WouterDeCoster38k

Thank you. I'm a little new to this interface.

written 7 weeks ago by ammasakshay20