Question: Map gene names to gene ids.
kandoigaurav (United States) wrote, 4.6 years ago:

I have some gene names (e.g., 11-cis-retinol dehydrogenase, D-2-hydroxyacid dehydrogenase (NAD+), 3alpha-hydroxysteroid 3-dehydrogenase) and their corresponding EC numbers. I want to map these gene names to other gene IDs such as Entrez Gene ID, Ensembl ID, etc.

There are more than 1,000 entries, so I can't annotate them manually.

Can someone suggest a way to map these names to ids?

komal.rathi (Children's Hospital of Philadelphia, Philadelphia, PA) wrote, 4.6 years ago:

Have you ever used BioMart? Which organism do these genes belong to? Which fields exactly are you looking for?


Yep. It doesn't limit the query to Gene names.

Reply written 4.6 years ago by kandoigaurav

It does, e.g., via HGNC Symbol or WikiGene Name. However, as Devon Ryan suggested, using the appropriate AnnotationDbi package would be a better fit and will give more "accurate" results.

Reply written 4.6 years ago by komal.rathi

I want to convert the gene names, not the symbols.

Reply written 4.6 years ago by kandoigaurav

Got it! Use Bioconductor. It will give you gene symbols, Entrez IDs, Ensembl gene IDs, etc. for your gene names.

Reply written 4.6 years ago by komal.rathi

Okay. Sounds cool. Lemme try. Thanks

Reply written 4.6 years ago by kandoigaurav

Umm, can you tell me how to use the package?

Reply written 4.6 years ago by kandoigaurav

The general idea is to make a character vector of the gene names that you want to look up and then do something like select(org.Mm.eg.db, keys=genes, columns=c("SYMBOL","ENTREZID","ENSEMBL"), keytype="GENENAME"). That will look up the gene symbol, Entrez ID, and Ensembl ID associated with each gene name in the genes vector. Note that this isn't a bullet-proof method. For example, it won't find any of your examples because it's expecting other names. "11-cis-retinol dehydrogenase" is also called "retinol dehydrogenase 5", for example, and that'll be found. All of these values come from Entrez, so there aren't mappings to every possible name.

If this doesn't work, I'd try something from this thread: Gene Id Conversion Tool

Reply written 4.6 years ago by Devon Ryan
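
For readers new to Bioconductor, here is a minimal sketch of the lookup described in the reply above. It is not the poster's own code. It assumes human genes (org.Hs.eg.db; swap in org.Mm.eg.db for mouse) and that the packages were installed through BiocManager. The gene_names vector and the found/unmatched variable names are illustrative. Because select() raises an error when none of the supplied keys are valid, the sketch first checks the names against the known GENENAME keys and collects the ones that did not match.

```r
# One-time installation (assumes BiocManager is available from CRAN):
# install.packages("BiocManager")
# BiocManager::install(c("AnnotationDbi", "org.Hs.eg.db"))

library(AnnotationDbi)
library(org.Hs.eg.db)  # assumption: human; use org.Mm.eg.db for mouse

# Illustrative input: names must be spelled the way Entrez Gene spells them
gene_names <- c("retinol dehydrogenase 5", "11-cis-retinol dehydrogenase")

# Split the input into names that exist as GENENAME keys and names that do not
known     <- keys(org.Hs.eg.db, keytype = "GENENAME")
found     <- intersect(gene_names, known)
unmatched <- setdiff(gene_names, known)

if (length(found) > 0) {
  # AnnotationDbi::select is spelled out in case dplyr::select is also loaded
  res <- AnnotationDbi::select(org.Hs.eg.db,
                               keys    = found,
                               columns = c("SYMBOL", "ENTREZID", "ENSEMBL"),
                               keytype = "GENENAME")
} else {
  res <- data.frame()  # nothing matched; avoid calling select() with no valid keys
}

print(res)        # one row per name/ID combination (a name can map to several IDs)
print(unmatched)  # names that need another route, such as EC numbers
```

With more than 1,000 names, the unmatched vector is probably the most useful output, since it shows how many names need a synonym or a different key type.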

You can also try the EC numbers, which AnnotationDbi calls ENZYME. That might end up working a bit better.

Reply written 4.6 years ago by Devon Ryan
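
The EC-number route suggested above could look roughly like the following sketch. It again assumes human genes and org.Hs.eg.db. The ec_numbers values are illustrative placeholders, and the sketch assumes that ENZYME keys are stored as bare numbers (for example "1.1.1.1") without an "EC " prefix, so any such prefix is stripped first. Check keys(org.Hs.eg.db, keytype = "ENZYME") on your installation to confirm the format.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)  # assumption: human; use org.Mm.eg.db for mouse

# Illustrative input; real data would come from the EC column of your table
ec_numbers <- c("EC 1.1.1.1", "1.1.1.315")

# Assumption: ENZYME keys are bare EC numbers, so drop a leading "EC " if present
ec_clean <- sub("^EC\\s*", "", ec_numbers)

# Keep only EC numbers present in the database; select() errors if none are valid
valid_ec <- intersect(ec_clean, keys(org.Hs.eg.db, keytype = "ENZYME"))

if (length(valid_ec) > 0) {
  res_ec <- AnnotationDbi::select(org.Hs.eg.db,
                                  keys    = valid_ec,
                                  columns = c("GENENAME", "SYMBOL", "ENTREZID", "ENSEMBL"),
                                  keytype = "ENZYME")
} else {
  res_ec <- data.frame()  # no EC number matched
}

print(res_ec)
```

An EC number describes an enzyme activity rather than a single gene, so one EC number can return several genes. The result may need manual review where the mapping is one-to-many.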
Devon Ryan (Freiburg, Germany) wrote, 4.6 years ago:

Have you tried the appropriate AnnotationDbi package in Bioconductor (e.g., org.Mm.eg.db or org.Hs.eg.db)?


No. Does it convert gene names to other IDs? I haven't used Bioconductor yet.

Reply written 4.6 years ago by kandoigaurav