Question: Ensembl ID mapping using Biomart
4 weeks ago, prithvi.mastermind wrote:

I mapped human Ensembl IDs using the biomaRt package in R. It gave me a one-to-one mapping from Ensembl gene ID to external gene ID. Is this Ensembl gene ID different from the original Ensembl IDs that I initially had in my files?

The original Ensembl IDs have a decimal suffix after the number, as in "ENSG00000280109.1", whereas the corresponding Ensembl gene ID ("ENSG00000280109"), which I passed as values in my Biomart code, has no decimal. Please explain what is the difference between these two?

Secondly, the UCSC Xena browser gives us a list of Ensembl IDs and their corresponding mapped gene symbols, but there are many cases in which the same gene corresponds to different Ensembl IDs.

Which one should I trust: the mapping via Biomart or the mapping file provided by Xena itself?

On what criteria is Biomart assigning a single gene to each Ensembl gene ID/original Ensembl IDs whereas UCSC Xena provides different Ensembl IDs corresponding to the same gene?


Here you can find the answer to your first question about the decimal number in the Ensembl IDs.

Comment written 4 weeks ago by iraun
4 weeks ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

> Please explain what is the difference between these two?

The number after the dot is a version number and can usually be safely ignored. In fact, some software may not recognize Ensembl IDs that include the version suffix.
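As a minimal sketch of ignoring the version, the suffix can be stripped before the IDs are handed to a tool that expects unversioned IDs. This is written in Python for illustration, and the helper name `strip_version` is hypothetical, not part of any Ensembl or Biomart API. It assumes everything after the first dot is the version.

```python
def strip_version(ensembl_id: str) -> str:
    """Return the Ensembl ID without its trailing ".<version>" suffix."""
    # Assumption: everything after the first dot is the version.
    return ensembl_id.split(".", 1)[0]


# The two forms from the question reduce to the same unversioned ID.
ids = ["ENSG00000280109.1", "ENSG00000280109"]
unversioned = [strip_version(i) for i in ids]
print(unversioned)  # ['ENSG00000280109', 'ENSG00000280109']
```

IDs that have no dot are returned unchanged, so the helper can be applied to a mixed list.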

> there are many cases in which the same gene corresponds to different Ensembl IDs.

Most likely because of different versions. Check which version of Ensembl each of your tools is using. Avoid mixing versions; pick one and stick to it.
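To see how many symbols in a mapping file are affected, the rows can be grouped by gene symbol. The following is a rough sketch in Python. The function names are hypothetical and the rows are made-up placeholders, not real annotations. It assumes the file has already been read into (Ensembl ID, symbol) pairs.

```python
from collections import defaultdict


def ids_per_symbol(rows):
    """Group Ensembl IDs by gene symbol, ignoring version suffixes."""
    grouped = defaultdict(set)
    for ensembl_id, symbol in rows:
        # Drop the ".<version>" suffix so versions of one ID count once.
        grouped[symbol].add(ensembl_id.split(".", 1)[0])
    return grouped


def ambiguous_symbols(rows):
    """Return only the symbols that map to more than one Ensembl ID."""
    return {
        symbol: sorted(ids)
        for symbol, ids in ids_per_symbol(rows).items()
        if len(ids) > 1
    }


# Toy rows with placeholder IDs and symbols (not real annotations).
rows = [
    ("ENSG_PLACEHOLDER_1.3", "GENE_A"),
    ("ENSG_PLACEHOLDER_2.1", "GENE_A"),
    ("ENSG_PLACEHOLDER_3.2", "GENE_B"),
    ("ENSG_PLACEHOLDER_3.5", "GENE_B"),
]
print(ambiguous_symbols(rows))
```

In this toy input, GENE_B stops looking ambiguous once the version suffix is ignored, while GENE_A still maps to two distinct IDs and would need a closer look.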

> On what criteria is Biomart assigning a single gene to each Ensembl gene ID/original Ensembl IDs

By definition, a gene ID in Ensembl corresponds to a single gene. A gene in Ensembl is defined by annotating a reference genome and is thus associated with a chromosomal region. Ensembl maintains its own mapping of external IDs to Ensembl IDs, which is what Biomart uses.

> whereas UCSC Xena provides different Ensembl IDs corresponding to the same gene.

Probably because Xena is using a different version of the Ensembl genome annotations and/or a different mapping between Ensembl and non-Ensembl IDs than the one used by Ensembl/Biomart.
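One way to check where two sources disagree is to compare their mappings directly. The sketch below is Python with a hypothetical `compare_mappings` helper and made-up placeholder entries. It assumes each source has been loaded as a dictionary from unversioned Ensembl ID to gene symbol.

```python
def compare_mappings(a, b):
    """Summarize how two {ensembl_id: symbol} mappings differ."""
    shared = a.keys() & b.keys()
    return {
        # IDs present in only one source, e.g. added or retired
        # between annotation versions.
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        # IDs present in both sources but given different symbols.
        "different_symbol": sorted(i for i in shared if a[i] != b[i]),
    }


# Placeholder mappings (not real annotations).
biomart_map = {"ID_1": "GENE_A", "ID_2": "GENE_B", "ID_3": "GENE_C"}
xena_map = {"ID_1": "GENE_A", "ID_2": "GENE_X", "ID_4": "GENE_D"}

report = compare_mappings(biomart_map, xena_map)
print(report)
```

Many entries under `only_in_a` or `only_in_b` would be consistent with the two sources using different annotation versions, though this comparison alone cannot confirm the cause.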

What you need to understand is that there is no single definition of a gene and each resource has its own.
