Question: Ensembl id to GeneSymbol with biomart
9 months ago by
Bioinfo200 wrote:


I have 3224 Ensembl IDs as row names in a data frame "G". To convert the Ensembl IDs into gene symbols, I used biomaRt as follows.

library(biomaRt)
mart <- useDataset("hsapiens_gene_ensembl", useMart("ensembl"))
genes <- rownames(G)
G <- G[, -6]
G_list <- getBM(filters = "ensembl_gene_id",
                attributes = c("ensembl_gene_id", "hgnc_symbol"),
                values = genes, mart = mart)

Now G_list contains only 3200 Ensembl IDs, some with a gene symbol and some without. Why are the other 24 Ensembl IDs missing from G_list? If those 24 Ensembl IDs have no gene symbol, they should at least show "-".

What is the problem here?
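
To see which IDs were dropped, the queried IDs can be compared with the ones that came back. This is a minimal sketch, assuming that getBM returns rows only for IDs it finds in the queried dataset. The toy `genes` and `G_list` values stand in for the real objects above, and `missing_ids`, `all_ids` and `G_full` are names introduced here for illustration.

```r
# Toy stand-ins for the real objects (illustrative values, not query output;
# "ENSG00000000000" is a made-up ID that plays the role of a dropped one)
genes <- c("ENSG00000141510", "ENSG00000012048", "ENSG00000000000")
G_list <- data.frame(ensembl_gene_id = c("ENSG00000141510", "ENSG00000012048"),
                     hgnc_symbol = c("TP53", "BRCA1"),
                     stringsAsFactors = FALSE)

# IDs that were queried but are absent from the result
missing_ids <- setdiff(genes, G_list$ensembl_gene_id)

# Left join so every queried ID keeps a row, then mark absent symbols with "-"
all_ids <- data.frame(ensembl_gene_id = genes, stringsAsFactors = FALSE)
G_full <- merge(all_ids, G_list, by = "ensembl_gene_id", all.x = TRUE)
no_symbol <- is.na(G_full$hgnc_symbol) | G_full$hgnc_symbol == ""
G_full$hgnc_symbol[no_symbol] <- "-"
```

With the real data, `missing_ids` would list the 24 IDs that did not come back, and `G_full` would have one row per queried ID.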

9 months ago by
sandeep.amberkar180 wrote:

There are often many-to-many relationships between Ensembl IDs and HGNC symbols, which makes it tedious to obtain exact gene symbols. It is better to use the mapIds function (from the AnnotationDbi package) to resolve those relationships. I wrote a small function to identify the 1:1 mappings. It returns a list with two elements: the first is a data frame of the 1:1 mapped IDs, and the second contains the unmapped IDs, which you can remove from your dataset if required.

Hope it helps!

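  # x is left blank here; it must be set to an annotation database object before this will run
  idmap <- mapIds(x = , keys = IDs, column = IDTo, keytype = IDFrom, multiVals = "first")
  idmap_df <- data.frame(From = names(idmap), To = unlist(unname(idmap)), stringsAsFactors = FALSE)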
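
The two-element list described above could be assembled roughly as follows. This is a sketch under assumptions, not the original function: `split_id_mapping` is a hypothetical helper name, and its input is assumed to be a named character vector of the kind mapIds returns with multiVals = "first", with NA where no mapping was found. It does not check whether several source IDs share the same target symbol.

```r
# Hypothetical helper: split a mapIds()-style result into mapped and unmapped IDs.
# `idmap` is assumed to be a named character vector (names = source IDs,
# values = target IDs, NA where no mapping was found).
split_id_mapping <- function(idmap) {
  idmap <- unlist(idmap)
  found <- !is.na(idmap) & nzchar(idmap)
  mapped <- data.frame(From = names(idmap)[found],
                       To = unname(idmap)[found],
                       stringsAsFactors = FALSE)
  list(mapped = mapped, unmapped = names(idmap)[!found])
}

# Toy input standing in for real mapIds() output; "ENSG00000000000" is a
# made-up ID used only to show the unmapped case
toy_map <- c(ENSG00000141510 = "TP53",
             ENSG00000012048 = "BRCA1",
             ENSG00000000000 = NA)
res <- split_id_mapping(toy_map)
```

Here `res$mapped` holds the From/To pairs and `res$unmapped` holds the IDs without a symbol, which can then be dropped from the dataset if required.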
