Question: Is BioMart up to date with NCBI?
4.0 years ago by
Germany
lindfors.erno20 wrote:

Hi all,

I want to convert mouse (MGI) gene symbols to Entrez Gene IDs using BioMart's R interface, biomaRt (http://www.bioconductor.org/packages/release/bioc/html/biomaRt.html).
Every now and then I come across a mouse gene symbol for which BioMart returns no Entrez Gene ID, even though I can find one on the NCBI website.

For example, I use the following R code to look up the Entrez Gene ID for the gene symbol "0610009E02Rik":
library("biomaRt")
# Connect to the Ensembl mart and select the mouse gene dataset
ensembl <- useMart("ensembl")
ensembl <- useDataset("mmusculus_gene_ensembl", mart = ensembl)
geneSymbs <- c("0610009E02Rik")
# Request each MGI symbol together with its Entrez Gene ID
geneSymbsEntrezGenes <- getBM(attributes = c("mgi_symbol", "entrezgene"), filters = "mgi_symbol", values = geneSymbs, mart = ensembl)


Printing the result shows that no Entrez Gene ID was found:
> geneSymbsEntrezGenes
     mgi_symbol entrezgene
1 0610009E02Rik         NA
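
When checking a longer list of symbols, it may help to collect the ones that came back without an ID. The following is a minimal sketch, assuming the result is a data frame with the columns mgi_symbol and entrezgene as above. The helper name unmapped_symbols and the symbol "SomeOtherGene" are made up for illustration.

```r
# Return the queried symbols that have no Entrez Gene ID in a getBM()-style
# result: either the ID is NA or the symbol is missing from the result.
# `result` is assumed to have the columns mgi_symbol and entrezgene.
unmapped_symbols <- function(query, result) {
  mapped <- result$mgi_symbol[!is.na(result$entrezgene)]
  setdiff(query, mapped)
}

# Toy result mimicking the output above. "SomeOtherGene" is a made-up symbol
# standing in for one that is absent from the result altogether.
res <- data.frame(mgi_symbol = "0610009E02Rik", entrezgene = NA_integer_,
                  stringsAsFactors = FALSE)
missing_ids <- unmapped_symbols(c("0610009E02Rik", "SomeOtherGene"), res)
print(missing_ids)
```

Both symbols are reported here, because neither has a usable ID in the toy result.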


However, I can find an Entrez Gene ID for this gene symbol on the NCBI website:
http://www.ncbi.nlm.nih.gov/gene/?term=0610009E02Rik

So it seems to me that the version of BioMart I am using (biomaRt_2.18.0) is not up to date with NCBI.
Is BioMart perhaps compiled from NCBI only periodically (e.g. once a month or every second month)?
If so, should I access NCBI directly and skip BioMart when I want to be sure of getting the most up-to-date conversions?

Or am I perhaps using an outdated version of BioMart?

Thanks,
Erno Lindfors


It looks like you want to map MGI accessions to Entrez Gene IDs. I've had trouble getting consistent results from BioMart at times, and I see what looks like the same problem when I use the BioMart web tool on the Ensembl website.

You might want to get the accessions directly from Jax (MGI): ftp://ftp.informatics.jax.org/pub/reports/index.html#marker

There's a file listed there called "MGI Marker associations to Entrez Gene (tab-delimited)". This might be an easier way of getting the most up-to-date data.
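
A lookup against such a tab-delimited file could be sketched roughly as follows. The function name lookup_entrez is hypothetical, and the column positions are assumptions: pass whichever positions the report's column description gives for the marker symbol and the Entrez Gene ID. The three rows in the demo are made-up data, not real MGI records.

```r
# Look up Entrez Gene IDs for marker symbols in a tab-delimited report.
# symbol_col and entrez_col are ASSUMED column positions; check them against
# the column description that accompanies the report before relying on them.
lookup_entrez <- function(report, symbols, symbol_col, entrez_col) {
  tab <- read.delim(report, header = FALSE, quote = "",
                    stringsAsFactors = FALSE, colClasses = "character")
  idx <- match(symbols, tab[[symbol_col]])   # NA where a symbol is absent
  data.frame(mgi_symbol = symbols,
             entrezgene = tab[[entrez_col]][idx],
             stringsAsFactors = FALSE)
}

# Demo on an in-memory table with made-up rows (accession, symbol, Entrez ID).
toy <- textConnection("MGI:0000001\tGeneA\t111\nMGI:0000002\tGeneB\t222")
hits <- lookup_entrez(toy, c("GeneB", "GeneC"), symbol_col = 2, entrez_col = 3)
close(toy)
print(hits)
```

For the real report, `report` would be the path to the downloaded file. IDs are read as character strings so that empty fields are not silently coerced.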

Reply written 4.0 years ago by pld (4.8k)
4.0 years ago by
Emily_Ensembl18k
EMBL-EBI
Emily_Ensembl18k wrote:

BioMart comes from Ensembl. That means that you're getting the current Ensembl data from BioMart. Ensembl map MGI IDs to Ensembl genes and Entrez IDs to Ensembl genes. That means that when you're converting an MGI ID to an Entrez ID you're actually converting MGI->Ensembl->Entrez, so if any of those steps are missing, you won't get the conversion. The second point is that BioMart will be accessing the current Ensembl, so will be fetching our most recent data freeze of MGI and Entrez. Anything newer will not be picked up.
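
To illustrate why a missing link at either step leaves the conversion empty, here is a toy sketch of the chained MGI->Ensembl->Entrez lookup using two small tables and base R's merge(). All identifiers are made up, and this is only an illustration of the idea, not how Ensembl builds its mappings.

```r
# Made-up identifiers, for illustration only.
mgi_to_ensembl <- data.frame(mgi_symbol = c("GeneA", "GeneB", "GeneC"),
                             ensembl_id = c("ENS_A", "ENS_B", NA),
                             stringsAsFactors = FALSE)
ensembl_to_entrez <- data.frame(ensembl_id = "ENS_A",
                                entrezgene = 111L,
                                stringsAsFactors = FALSE)

# A left join keeps every symbol; a symbol with no Ensembl gene (GeneC) or
# whose Ensembl gene has no Entrez mapping (GeneB) ends up with NA.
chained <- merge(mgi_to_ensembl, ensembl_to_entrez,
                 by = "ensembl_id", all.x = TRUE)
chained <- chained[order(chained$mgi_symbol),
                   c("mgi_symbol", "ensembl_id", "entrezgene")]
print(chained)
```

Only GeneA gets an Entrez Gene ID, because it is the only symbol with both links present.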


Good to know that it always maps things through Ensembl stable IDs.

Reply written 4.0 years ago by pld (4.8k)
4.0 years ago by
Germany
lindfors.erno20 wrote:

Thanks to both Joe and Emily for your comments!
I tested the file on the Jax website, and it does indeed seem to contain the most up-to-date conversions.

With best regards,
Erno Lindfors
