Question: How to convert current Ensembl Gene ID into older versions
2.4 years ago by wrote:

Hi there, I am currently analyzing a signaling pathway with lncRNA2Function, but some of my lncRNAs are not recognized by the database. It shows this message: "The following lncRNAs that you input could not be found in the Ensembl v70 (GENCODE v15) and were excluded from the enrichment analysis". My question is whether the Ensembl Gene ID differs between the old Ensembl version and the current one. If it does, how can I convert the current IDs into the old version? Thanks!

ensembl gene • 1.3k views
modified 2.4 years ago by Emily_Ensembl • written 2.4 years ago

You are comparing two Ensembl releases that are far apart: release 70 (January 2013) and the current release 84 (March 2016). The Ensembl Gene ID can (and will) differ between older and current versions of Ensembl. A new ID is assigned whenever the structure of the lincRNA gene model has changed, even if the change was minor (e.g. extending the first or last exon). There are a few scenarios here:

  • Gene structure that is exactly the same between v84 and v70. The ENSG ID will be the same. Conversion: not needed. Happy days.

  • Genes that are new in v84, added by the newer annotation on the latest assembly. These are absent from v70. Conversion: not possible. Oh well, that's life days.

  • Genes in v70 whose annotation was not confirmed/supported in v84. These old genes have been deprecated (e.g. ENSG00000232274). Conversion: not needed. Worrying days. Should one trust this old annotation when a newer, more up-to-date version is available?

  • Gene present in v84 and v70 but with slightly different structures. They will have different ENSG IDs. Conversion: mandatory. Busy days.
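As a rough illustration of the scenarios above, here is a minimal Python sketch that sorts a list of query IDs by whether they appear in each release. The function name `classify_ids` and all IDs are hypothetical placeholders, and the two ID sets are assumed to have been extracted beforehand (for example from the GTF of each release).

```python
def classify_ids(query_ids, old_ids, new_ids):
    """Bucket query IDs by their presence in the old and new releases."""
    buckets = {"in_both": [], "new_only": [], "old_only": [], "unknown": []}
    for gene_id in query_ids:
        in_old = gene_id in old_ids
        in_new = gene_id in new_ids
        if in_old and in_new:
            buckets["in_both"].append(gene_id)   # same ID: no conversion needed
        elif in_new:
            buckets["new_only"].append(gene_id)  # new gene, or a changed gene with a new ID
        elif in_old:
            buckets["old_only"].append(gene_id)  # deprecated in the new release
        else:
            buckets["unknown"].append(gene_id)   # in neither set: check for typos
    return buckets


# Hypothetical placeholder IDs, not real Ensembl genes.
old_release = {"ENSG90000000001", "ENSG90000000002"}
new_release = {"ENSG90000000001", "ENSG90000000003"}
query = ["ENSG90000000001", "ENSG90000000002", "ENSG90000000003", "ENSG90000000004"]

result = classify_ids(query, old_release, new_release)
for bucket, ids in result.items():
    print(bucket, ids)
```

IDs that land in `new_only` are the ones worth following up by gene name or coordinates, since presence alone cannot tell a truly new gene from a restructured one.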

Let's focus on the last scenario: in addition to the gene symbols suggested by @EagleEye, I'd suggest looking at gene names to catch those of your lincRNAs that do not have an HGNC symbol but rather a clone name (e.g. RP11-506F3). You can get both HGNC symbols and clone names from the Ensembl GTF. You could also try converting the coordinates of the lincRNA genes from GRCh38 (v84) to GRCh37 (v70).
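The name-based lookup could be sketched as follows, assuming both GTF files carry `gene_id` and `gene_name` in the attribute column (the ninth tab-separated field). The helper names and the sample rows are hypothetical placeholders, not real annotation. Gene names are not guaranteed to be unique or stable between releases, so treat the result as a list of candidates to check by hand.

```python
import re

# Captures key "value" pairs from the GTF attribute column.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def names_to_ids(gtf_lines):
    """Map gene_name -> gene_id (version suffix dropped) from GTF rows."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF row
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id", "").split(".")[0]
        gene_name = attrs.get("gene_name")
        if gene_id and gene_name:
            # Keep the first ID seen for each name; duplicates need manual review.
            mapping.setdefault(gene_name, gene_id)
    return mapping


def map_new_to_old(new_gtf_lines, old_gtf_lines):
    """Pair each new-release ID with the old-release ID sharing its gene name."""
    old_by_name = names_to_ids(old_gtf_lines)
    new_by_name = names_to_ids(new_gtf_lines)
    return {
        new_id: old_by_name[name]
        for name, new_id in new_by_name.items()
        if name in old_by_name
    }


def make_row(gene_id, gene_name):
    """Build a placeholder GTF exon row for the demo below."""
    attrs = f'gene_id "{gene_id}"; gene_name "{gene_name}";'
    return "\t".join(["1", "demo", "exon", "100", "200", ".", "+", ".", attrs])


# Hypothetical rows standing in for the old and new release files.
old_gtf = [make_row("ENSG90000000001", "LNC-EXAMPLE-A")]
new_gtf = [
    make_row("ENSG90000000002.3", "LNC-EXAMPLE-A"),
    make_row("ENSG90000000003.1", "LNC-EXAMPLE-B"),
]

converted = map_new_to_old(new_gtf, old_gtf)
print(converted)
```

With real data, `gtf_lines` would be an open file handle for each release's GTF. The sketch reads attributes from every row rather than only `gene` rows, since not every GTF file includes gene-level features.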

modified 2.4 years ago • written 2.4 years ago by Denise - Open Targets
2.4 years ago by EagleEye wrote:
  1. Have you tried Gene Symbols? They may work better. Also try removing the version suffix at the end of the gene ID, for example 'ENSG00000228630' instead of 'ENSG00000228630.2'.

  2. Since the lncRNA2Function study was based on GENCODE v15, it is possible that it does not include the newer lncRNAs.
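Removing the version suffix mentioned in point 1 can be scripted. The following is a minimal Python sketch; the function name `strip_version` is made up for illustration, and the pattern assumes human gene IDs of the form `ENSG` followed by 11 digits.

```python
import re

# Human Ensembl gene ID with an optional ".N" version suffix.
ENSG_PATTERN = re.compile(r"^(ENSG\d{11})(?:\.\d+)?$")


def strip_version(gene_id):
    """Return the ID without its version suffix, or None if it is malformed."""
    match = ENSG_PATTERN.match(gene_id.strip())
    return match.group(1) if match else None


ids = ["ENSG00000228630.2", "ENSG00000228630", "not_an_id"]
cleaned = [strip_version(i) for i in ids]
print(cleaned)  # ['ENSG00000228630', 'ENSG00000228630', None]
```

Returning `None` for anything that does not look like a gene ID makes typos visible before the list is submitted.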

The number of annotated lncRNAs keeps growing, and there is a huge difference between v15 and the current versions, or even the latest version based on hg19 (GENCODE v19). So any lncRNA that lncRNA2Function reports as not found was simply not annotated when they carried out the study.

Example: the number of genes in the lincRNA class is 6,458 in GENCODE v15 and 7,114 in v19 (other classes also differ hugely).

I also could not find any annotation update history on the lncRNA2Function website. The predictions are still based on v15.
