Question: Ensembl gene ID conversion to gene name
Annika Forsingdal (Copenhagen, Denmark) wrote 2.5 years ago:

Hi,

In our RNA-seq pipeline we have a step after mapping that converts Ensembl gene IDs (e.g. ENSMUSG00000096126) to gene names. When the pipeline was built, we mapped the read files to older versions of the mouse genome.

What happens when new samples mapped to the newest version of the mouse genome are run through the pipeline? Will we just miss the gene names of the transcripts that were not included in the old build of the mouse genome? Are there any additional consequences?

 

Thank you for your time,

Annika

rna-seq reference genome • 1.6k views

I am not sure I understand the problem here. If you have transcript or gene IDs from a given version, use that same version to retrieve the gene names; for example, if you map your reads to Ensembl v82, use the v82 BioMart or API to retrieve the corresponding gene names. If you are trying to use transcript or gene IDs from an older version to find gene names in a newer one, you may indeed run into problems, such as IDs no longer being present in the new version, but I don't think you should be doing this. If accurately identifying genes in a new version of Ensembl is critical, you should probably remap all your data to that version.
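One way to keep the lookup tied to a single release is to build the ID-to-name table from the same annotation file that was used for mapping and counting. The following is only a minimal sketch, assuming an Ensembl-style GTF whose `gene` records carry `gene_id` and `gene_name` attributes; the function names, the placeholder gene name, and the second ID are made up for illustration.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_names_from_gtf(lines):
    """Build {gene_id: gene_name} from the 'gene' records of an Ensembl-style GTF."""
    names = {}
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only gene-level records are needed
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs:
            # Fall back to the ID itself when the record has no gene_name.
            names[attrs["gene_id"]] = attrs.get("gene_name", attrs["gene_id"])
    return names


def annotate(gene_ids, names):
    """Return (mapped, missing): names for known IDs, and IDs absent from this release."""
    mapped = {g: names[g] for g in gene_ids if g in names}
    missing = [g for g in gene_ids if g not in names]
    return mapped, missing


# Made-up GTF excerpt; "PlaceholderA" is NOT the real name of this gene.
example_gtf = [
    '1\tensembl\tgene\t100\t900\t.\t+\t.\tgene_id "ENSMUSG00000096126"; gene_name "PlaceholderA";',
    '1\tensembl\texon\t100\t300\t.\t+\t.\tgene_id "ENSMUSG00000096126"; gene_name "PlaceholderA";',
]
names = gene_names_from_gtf(example_gtf)
mapped, missing = annotate(["ENSMUSG00000096126", "ENSMUSG_NOT_IN_RELEASE"], names)
print(mapped)   # IDs that received a name
print(missing)  # IDs that this release does not know about
```

Reporting the `missing` list explicitly, rather than silently dropping those IDs, makes a mismatch between the mapping release and the annotation release visible in the pipeline output.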

Comment written 2.5 years ago by Jean-Karim Heriche
Michael Dondrup (Bergen, Norway) wrote 2.5 years ago:

You should always make sure that the versions of the genome build/assembly and the annotation are consistent. I don't exactly understand which part of your pipeline is not updated or why, but I suggest updating either everything or nothing. For a comparative analysis I would re-map everything against the latest assembly and gene models.

Also, I wouldn't use gene names (by which I guess you mean gene symbols) except as additional final annotation. Ensembl gene IDs are unique and mostly don't change (although IDs can be retired or added between releases), while gene names are ambiguous and may change.
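As a rough illustration of keeping the Ensembl ID as the key and treating the symbol only as a label, the sketch below flags symbols shared by more than one gene ID and attaches symbols to a count table without re-keying it. All IDs, symbols, and function names here are hypothetical; the version-stripping step assumes IDs may carry a `.N` suffix, as they do in some annotation sources such as GENCODE.

```python
from collections import defaultdict


def strip_version(gene_id):
    """Drop a trailing version suffix, e.g. 'ENSMUSG00000096126.2' -> 'ENSMUSG00000096126'."""
    return gene_id.split(".", 1)[0]


def ambiguous_symbols(id_to_symbol):
    """Return {symbol: sorted gene IDs} for symbols shared by more than one gene ID."""
    by_symbol = defaultdict(set)
    for gene_id, symbol in id_to_symbol.items():
        by_symbol[symbol].add(strip_version(gene_id))
    return {s: sorted(ids) for s, ids in by_symbol.items() if len(ids) > 1}


def label_counts(counts, id_to_symbol):
    """Keep the Ensembl ID as the key; attach the symbol as an annotation only."""
    symbols = {strip_version(g): s for g, s in id_to_symbol.items()}
    return {
        strip_version(g): {"symbol": symbols.get(strip_version(g)), "count": n}
        for g, n in counts.items()
    }


# Hypothetical IDs and symbols, for illustration only.
id_to_symbol = {
    "ENSMUSG_EXAMPLE_1": "SymA",
    "ENSMUSG_EXAMPLE_2": "SymA",  # same symbol, different gene ID
    "ENSMUSG_EXAMPLE_3": "SymB",
}
counts = {"ENSMUSG_EXAMPLE_1.4": 12, "ENSMUSG_EXAMPLE_3": 7}

print(ambiguous_symbols(id_to_symbol))
print(label_counts(counts, id_to_symbol))
```

Because the table stays keyed by gene ID, two genes that share a symbol are never merged by accident, and a symbol that changes between releases does not alter the keys used for joins.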
 

Abdullah (Germany) wrote 2.5 years ago:

Have a look at http://mygene.info/. It is an efficient way to convert between gene ID formats, and it can be used inside a pipeline as well.
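A batch lookup from inside a pipeline might look roughly like the sketch below. It assumes the service's v3 batch query endpoint (`/v3/query`, POST with `q`, `scopes`, `fields`, and `species`) and a JSON response in which each hit carries a `query` field plus either a `symbol` or a `notfound` flag; check the current MyGene.info documentation before relying on these details. The function names and the sample response are made up for illustration, and the service reflects its current annotation rather than a specific Ensembl release.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the current MyGene.info documentation.
MYGENE_QUERY_URL = "https://mygene.info/v3/query"


def fetch_symbols(gene_ids, species="mouse"):
    """POST a batch query and return the decoded JSON list (requires network access)."""
    data = urllib.parse.urlencode({
        "q": ",".join(gene_ids),
        "scopes": "ensembl.gene",
        "fields": "symbol",
        "species": species,
    }).encode()
    with urllib.request.urlopen(MYGENE_QUERY_URL, data=data, timeout=30) as resp:
        return json.load(resp)


def symbols_from_hits(hits):
    """Split a batch response into ({gene_id: symbol}, [gene IDs without a symbol])."""
    mapped, queried = {}, []
    for hit in hits:
        query = hit["query"]
        if query not in queried:
            queried.append(query)
        if not hit.get("notfound") and "symbol" in hit:
            # Keep the first symbol if the same query returns several hits.
            mapped.setdefault(query, hit["symbol"])
    missing = [q for q in queried if q not in mapped]
    return mapped, missing


# Hand-written response in the assumed shape; the symbol is a placeholder.
sample_hits = [
    {"query": "ENSMUSG00000096126", "_id": "0", "symbol": "PlaceholderA"},
    {"query": "ENSMUSG_NOT_FOUND", "notfound": True},
]
mapped, missing = symbols_from_hits(sample_hits)
print(mapped)
print(missing)
```

In a real run, `symbols_from_hits(fetch_symbols(ids))` would replace the hand-written sample; keeping the parsing separate from the network call lets the pipeline step be tested offline.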
