Why does a gene name have more than one Ensembl ID in the GTF?
2.7 years ago
walker ▴ 30

As the title says: in the Ensembl GTF file, I found different gene_ids that share the same gene_name. For example, the gene_name TBCE has two gene_ids, ENSG00000284770 and ENSG00000285053.
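To see how widespread this is, the GTF can be scanned for every gene_name that maps to more than one gene_id. The following is a minimal sketch, assuming the standard nine-column GTF layout with `gene_id "..."; gene_name "...";` pairs in the attribute column; the function name `names_with_multiple_ids` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def names_with_multiple_ids(gtf_lines):
    """Map each gene_name to its gene_ids, keeping only names with more than one ID."""
    ids_by_name = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        # Only look at gene-level records; exons and transcripts repeat the same IDs.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs and "gene_name" in attrs:
            ids_by_name[attrs["gene_name"]].add(attrs["gene_id"])
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}
```

The function accepts any iterable of lines, so an open file handle works directly, for example `with open(path) as fh: dup = names_with_multiple_ids(fh)`, or `gzip.open(path, "rt")` for a compressed GTF. Genes without a gene_name attribute are skipped.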

ensembl gtf • 1.0k views

Those two IDs appear to have overlapping loci.

overlapping locus
    exon(s) of the locus overlap exon(s) of a readthrough transcript or a transcript belonging to another locus

Tagging Emily_Ensembl for additional clarification.

2.7 years ago

Looks like ENSG00000285053 is a readthrough of ENSG00000284770 (TBCE) and ENSG00000152904 (GGPS1): http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000285053;r=1:235328570-235448952

Readthroughs are annoying because we need RefSeq to agree they exist before HGNC can give them a meaningful name (in this case it would be GGPS1-TBCE). I'll report it to the relevant people, but sadly we might not be able to get it renamed. As it is, it's just taken the name of the gene it has the most sequence similarity to, which is TBCE.


I see, but are there other situations where this happens? For instance, ENSG00000274559 and ENSG00000234289 do not overlap, yet both are named "H2BFS" in the GTF.
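Whether two gene records overlap can be checked straight from the GTF coordinates. This is only a rough sketch: it compares whole gene spans (not individual exons), ignores strand, and assumes the standard nine-column GTF layout; the helper names `gene_spans` and `spans_overlap` are made up for illustration.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_spans(gtf_lines, gene_ids):
    """Return {gene_id: (seqname, start, end)} for the requested gene records."""
    wanted = set(gene_ids)
    spans = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        if gene_id in wanted:
            spans[gene_id] = (fields[0], int(fields[3]), int(fields[4]))
    return spans


def spans_overlap(a, b):
    """True if two (seqname, start, end) spans share at least one base.

    GTF coordinates are 1-based and inclusive, so touching ends count as overlap.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]
```

For the pair above, something like `s = gene_spans(fh, ["ENSG00000274559", "ENSG00000234289"])` followed by `spans_overlap(s["ENSG00000274559"], s["ENSG00000234289"])` would report whether the two gene spans intersect.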


If you have a list you would like us to investigate, please send it in to helpdesk [at] ensembl.org.


Thank you so much, I will try this later.
