Hi All, I have gene expression data with Ensembl IDs (ENSMUSG00000XXXXXX). I tried 3 different tools to convert them to Entrez IDs (bitr from clusterProfiler, biomaRt, AnnotationDbi), but I consistently get no match for about 5-6% of the genes. I would like to do GO and GSEA, but most GO and GSEA tools require gene symbols or Entrez IDs. This problem has been bugging me for a while. How should I handle this? I work with mouse genes.
Here is an example of what I am doing:
MyTargetList$entrez <- mapIds(org.Mm.eg.db, keys=rownames(IP_toptreatRT3), column ="ENTREZID", keytype="ENSEMBL", multiVals="first")
Or with biomaRt:
ensembl = useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
genemap <- getBM(attributes = c("ensembl_gene_id", "entrezgene"), mart = ensembl)
and then I use the match function to populate the column.
But there seem to be gaps in the databases:
> head(genemap)
     ensembl_gene_id entrezgene
1 ENSMUSG00000064336         NA
2 ENSMUSG00000064337         NA
3 ENSMUSG00000064338         NA
4 ENSMUSG00000064339         NA
5 ENSMUSG00000064340         NA
6 ENSMUSG00000064341      17716
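For reference, the match step I mentioned looks roughly like the sketch below. It uses only base R, and the genemap rows are copied from the head(genemap) output above as toy data rather than a fresh query. The small MyTargetList data frame and its placeholder column are hypothetical stand-ins for my real table, and frac_unmapped is just a name I picked for illustration.

```r
# Toy genemap built from the six rows printed above (not a new biomaRt query).
genemap <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000064336", "ENSMUSG00000064337",
                      "ENSMUSG00000064338", "ENSMUSG00000064339",
                      "ENSMUSG00000064340", "ENSMUSG00000064341"),
  entrezgene = c(NA, NA, NA, NA, NA, 17716),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the real target table; row names are Ensembl IDs.
MyTargetList <- data.frame(
  placeholder = c(NA, NA),
  row.names = c("ENSMUSG00000064341", "ENSMUSG00000064336")
)

# match() returns, for each row name, the position of its FIRST hit in genemap
# (or NA if the ID is absent), so duplicated Ensembl IDs keep only one Entrez ID.
idx <- match(rownames(MyTargetList), genemap$ensembl_gene_id)
MyTargetList$entrez <- genemap$entrezgene[idx]

# Fraction of target genes left without an Entrez ID.
unmapped <- is.na(MyTargetList$entrez)
frac_unmapped <- mean(unmapped)
print(MyTargetList)
print(frac_unmapped)
```

On my real data, the same mean(is.na(...)) check is how I arrive at the roughly 5-6% of genes without a match.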