Question: GAGE - eg2sym (ID conversion)
mbk0asis (Korea, Republic Of) wrote 2.2 years ago:

Hello.

I'm trying to run a KEGG pathway analysis with "GAGE", but I'm having trouble because the ID conversion isn't working for me.

First of all, I'm dealing with "cow" RNA-seq data.

Here's the R code I used...

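library(gage)
# Fetch the KEGG gene sets for cow ("bta"), keyed by Entrez gene ID
kg.bta <- kegg.gsets("bta")
# Keep only the signaling and metabolic pathways
kegg.gs <- kg.bta$kg.sets[kg.bta$sigmet.idx]
# Try to convert Entrez IDs to gene symbols in every set
kegg.gs.sym <- lapply(kegg.gs, eg2sym)
lapply(kegg.gs.sym[1:3], head)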

and the results are...

$`bta00970 Aminoacyl-tRNA biosynthesis`
[1] NA NA NA NA NA NA

$`bta02010 ABC transporters`
[1] NA NA NA NA NA NA

$`bta03008 Ribosome biogenesis in eukaryotes`
[1] NA NA NA NA NA NA

I guess the "eg2sym" only works for "human".

I know I could instead convert the IDs in my RNA-seq data to Entrez, but I lost about 5000 genes after that conversion, and I'm worried that losing them might distort the results.

So, I wonder if I can use "eg2sym" on cow data, and if not, how I can do it.

Thank you!

rna-seq gsea kegg gage • 1.1k views
bigmawen (United States) wrote 2.1 years ago:

You used the eg2sym function, which only works for human data. The pathview package provides a set of more general gene ID conversion functions (eg2id, id2eg, geneannot.map, etc.). These functions work for 19 major research species. For more details:

library(pathview)
?eg2id

The following code would work for you.

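kegg.gs.sym <- lapply(kegg.gs, function(x) {
  # Map the Entrez IDs of one gene set to cow ("Bt") gene symbols
  syms <- eg2id(x, org = "Bt", category = "symbol")
  # Column 2 of the returned matrix holds the symbols
  return(syms[, 2])
})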

Result shown:

> lapply(kegg.gs.sym[1:3],head)
$`bta00970 Aminoacyl-tRNA biosynthesis`
[1] "EARS2" "VARS2" "SARS"  "WARS"  "YARS"  "SARS2"

$`bta02010 ABC transporters`
[1] "LOC100296627" "ABCA2"        "ABCC9"        "ABCB4"        "ABCC6"       
[6] "LOC101909228"

$`bta03008 Ribosome biogenesis in eukaryotes`
[1] "RMRP"    "SPATA5"  "RN28S1"  "RN5-8S1" "AK6"     NA
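
The last set above ends in NA, so not every Entrez ID has a symbol. If you would rather keep such genes than lose them, one option is to fall back to the Entrez ID wherever the symbol is missing. Below is a minimal base R sketch of that idea. The helper name fill_missing_symbols and the toy IDs are hypothetical, made up for illustration, and are not part of pathview or gage.

```r
# Hypothetical helper: replace NA symbols with the original Entrez ID
# so that no gene silently drops out of a gene set.
fill_missing_symbols <- function(entrez, symbols) {
  stopifnot(length(entrez) == length(symbols))
  ifelse(is.na(symbols), entrez, symbols)
}

# Toy input (made-up IDs and symbols), shaped like one gene set
# before and after conversion
entrez  <- c("1001", "1002", "1003")
symbols <- c("GENEA", NA, "GENEC")

filled <- fill_missing_symbols(entrez, symbols)
print(filled)
```

Inside the lapply call above, this would presumably be applied as fill_missing_symbols(x, syms[, 2]), assuming the symbols come back in the same order as the input IDs.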

mbk0asis replied 2.1 years ago:

Thank you! It works perfectly.
Guangchuang Yu (China/Hong Kong/The University of Hong Kong) wrote 2.1 years ago:

bitr in clusterProfiler is another choice for you.

> require(org.Bt.eg.db)
> sample_eg = sample(keys(org.Bt.eg.db), 100)
> head(sample_eg)
[1] "104971204" "107131358" "107131847" "788587"    "789592"    "519105"
> require(clusterProfiler)
> eg2sym = bitr(sample_eg, fromType='ENTREZID', toType="SYMBOL", OrgDb=org.Bt.eg.db)
> head(eg2sym)
   ENTREZID       SYMBOL
1 104971204 LOC104971204
2 107131358 LOC107131358
3 107131847 LOC107131847
4    788587    LOC788587
5    789592    LOC789592
6    519105    LOC519105
> tail(eg2sym)
     ENTREZID       SYMBOL
95  104977419       CALEST
96  104974952 LOC104974952
97  107080383       INSINT
98  100336249 LOC100336249
99     504224  C15H11orf16
100    529919    LOC529919
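
The two-column table shown above can then be used to translate whole gene sets. Here is a rough base R sketch, assuming a table with ENTREZID and SYMBOL columns like the bitr output. The three rows are copied from the tail() output above; the gene-set list, its name toy_set, and the ID "999999999" are made up for illustration.

```r
# Mapping table shaped like the bitr() output; rows copied from tail(eg2sym)
eg2sym_tbl <- data.frame(
  ENTREZID = c("104977419", "107080383", "504224"),
  SYMBOL   = c("CALEST", "INSINT", "C15H11orf16"),
  stringsAsFactors = FALSE
)

# Hypothetical gene-set list keyed by Entrez ID (set name and last ID are made up)
gene_sets <- list(toy_set = c("504224", "104977419", "999999999"))

# match() finds each ID's row in the table; IDs absent from the table give NA
gene_sets_sym <- lapply(gene_sets, function(ids) {
  eg2sym_tbl$SYMBOL[match(ids, eg2sym_tbl$ENTREZID)]
})
print(gene_sets_sym)
```

Using match() keeps each set the same length and in the same order as the input, so unmapped IDs stay visible as NA instead of disappearing.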
EagleEye (Sweden) wrote 2.2 years ago:

Have you tried GeneSCF?

Here is an example of retrieving sheep and cow KEGG data as a simple text file:

A: Gene ontology in sheep

For enrichment analysis, use:

./geneSCF -m=update -i=INPUTgene.list -t=gid -db=KEGG -o=/ExistingOUTPUTfolder/ -org=bta --plot=yes --background=#NumberOfBackgroundGenes

For complete information, see:

Gene Set Clustering based on Functional annotation (GeneSCF)
