Question: How do I extract genes from a KEGG pathway?
Asked by ammasakshay, 9 months ago:

Using R, I want to generate a list of gene symbols only (without the accompanying text) from a pathway.

For example, if the input pathway is KEGG prostate cancer (hsa05215), I want the output to be a .csv file listing the genes in that pathway. I tried:

library("KEGGREST")

keggGet("hsa05215")[[1]]$GENE

but that returns the gene number and gene description along with each gene symbol. I want a list consisting of the gene symbols alone.

How do I get this?

Thank you.

Tags: bioconductor, R, gene

Append all the hsa IDs to the URL below and fetch the result. You can then parse the returned output.

Ex: http://rest.kegg.jp/get/hsa:04140+hsa:04510+hsa:04919

Hope this solves your problem.

written 9 months ago by suniltakekar

I did that here: http://rest.kegg.jp/get/hsa05215. The gene list is similar to the one I got through R: each line has a number, a gene symbol, and a description. I want the gene symbol alone. The output under GENE looks like this:

1027 CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]
1017 CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]
898 CCNE1; cyclin E1 [KO:K06626]

I want something that looks like:

CDKN1B
CDK2
CCNE1

written 9 months ago by ammasakshay
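
For readers who fetch the flat-file text instead, the parsing step described in this comment can be sketched in R as follows. This is a minimal sketch, not part of the original thread: it assumes each GENE line has the form "number SYMBOL; description" as quoted above, and `extract_symbols` is a hypothetical helper name.

```r
# Example lines copied from the GENE section quoted in the comment above
gene_lines <- c(
  "1027 CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]",
  "1017 CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]",
  "898 CCNE1; cyclin E1 [KO:K06626]"
)

# Hypothetical helper (assumption: every line is "number SYMBOL; description")
extract_symbols <- function(lines) {
  # Drop the leading gene number and the whitespace that follows it
  no_id <- sub("^[[:space:]]*[0-9]+[[:space:]]+", "", lines)
  # Drop everything from the first ";" onward, then trim stray whitespace
  trimws(sub(";.*$", "", no_id))
}

symbols <- extract_symbols(gene_lines)
print(symbols)
```

Lines that do not contain a ";" would pass through with only the leading number removed, so real output may need an extra check.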

Try this: http://rest.kegg.jp/list/hsa:90011+hsa:4550+hsa:2576

See the KEGG API documentation: https://www.kegg.jp/kegg/rest/keggapi.html

modified 9 months ago • written 9 months ago by suniltakekar

This doesn't help. It works only for individual genes and does not return the gene list of a pathway. Try your solution with the hsa05215 pathway and you will see that it does not return the list of 90+ genes.

written 9 months ago by ammasakshay
Answer by ammasakshay, 9 months ago:

I ended up solving it myself. Hopefully this helps anyone who has a similar need.

library("KEGGREST")

# Get the GENE field: a vector alternating gene number and "SYMBOL; description"
genes <- keggGet("hsa05215")[[1]]$GENE
# Drop the gene numbers by keeping every second element (positions 2, 4, 6, ...)
gene_entries <- genes[seq(2, length(genes), 2)]
# Remove everything from the ";" onward on each entry (this deletes the gene description)
gene_symbols <- sub(";.*", "", gene_entries)
# Export the vector as a CSV file
write.csv(gene_symbols, file = "hsa05215.csv", quote = FALSE, row.names = FALSE)
modified 9 months ago by WouterDeCoster • written 9 months ago by ammasakshay
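
The same steps can also be wrapped in a small function that keeps the gene numbers next to the symbols. This is a sketch under stated assumptions rather than the answer's own code: `gene_field` is a hand-made stand-in for the GENE field (using values quoted earlier in the thread, in the alternating layout the answer describes), and `genes_to_table` and the output file name are hypothetical.

```r
# Stand-in for keggGet("hsa05215")[[1]]$GENE (assumption: gene numbers and
# "SYMBOL; description" strings alternate, as the answer above describes)
gene_field <- c(
  "1027", "CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]",
  "1017", "CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]",
  "898",  "CCNE1; cyclin E1 [KO:K06626]"
)

# Hypothetical helper: split the alternating vector into a two-column table
genes_to_table <- function(gene_field) {
  # The alternating layout implies an even number of elements
  stopifnot(length(gene_field) %% 2 == 0)
  ids     <- gene_field[c(TRUE, FALSE)]  # positions 1, 3, 5, ...
  entries <- gene_field[c(FALSE, TRUE)]  # positions 2, 4, 6, ...
  data.frame(
    gene_id = ids,
    symbol  = sub(";.*$", "", entries),  # keep the text before the ";"
    stringsAsFactors = FALSE
  )
}

gene_table <- genes_to_table(gene_field)

# Write to a temporary directory so the sketch does not clutter the working directory
out_file <- file.path(tempdir(), "hsa05215_genes.csv")
write.csv(gene_table, file = out_file, quote = FALSE, row.names = FALSE)
```

Indexing with `c(TRUE, FALSE)` relies on R recycling the logical vector, which avoids computing positions with `seq()`. For the symbols alone, `gene_table$symbol` gives the same vector the answer exports.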

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

written 9 months ago by WouterDeCoster

Thank you. I'm a little new to this interface.

written 9 months ago by ammasakshay