Ensembl gene ID: meaning of the underscore
3.9 years ago
nhaus ▴ 300

Hello everybody, I performed a DGE analysis and am now trying to convert the gene IDs to gene names. The problem is that virtually all of my gene IDs look like this: ENSG00000000003.15_4.

In the list of gene IDs which correspond to the gene name, the gene IDs have this format: ENSG00000000003.15

I used kallisto to quantify my reads, with the genome as well as the GTF file from GENCODE (https://www.gencodegenes.org/human/release_33lift37.html).

Can I just delete everything after the _ and proceed? And what is the meaning of the underscore in the gene ID in general?

Any help is appreciated! Thanks!

ensembl geneID • 2.1k views

Hmm, I have never come across underscores in GENCODE IDs. Maybe this has to do with the lift from hg38 to hg19. I personally even remove the version numbers, so removing the part after the underscore probably does not do any harm.


Indeed, it is not part of the official format specification: https://www.ensembl.org/Help/Faq?id=488

You could contact the Ensembl Help Desk.

3.9 years ago
Ben_Ensembl ★ 2.4k

Yes, Kevin and ATpoint are correct. The underscore is not part of the Ensembl stable ID specifications.

Best wishes

Ben Ensembl Helpdesk


Do you nonetheless have an idea what it could mean? I am surprised that there is no info about this on the internet. As I said, I just downloaded the GTF and genome FASTA from https://www.gencodegenes.org/human/release_33lift37.html , which is widely used to my knowledge. Do you think the GTF file is damaged in some way? Or is it safe for me to work with this release (GRCh37, Release 33)?


Ah, sorry - I didn't notice the files from your original post. In the GENCODE GTFs for the annotation mapped back to GRCh37, mapping versions are appended to the identifiers (e.g. ENSG00000228327.3_2).

More information can be found in their documentation: https://www.gencodegenes.org/pages/data_format.html
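
For anyone who wants to strip the mapping version before matching IDs against a lookup table, here is a minimal Python sketch. The pattern is only inferred from the example IDs in this thread (stable ID, optional `.version`, optional `_mappingVersion`), and the function names `split_gencode_id` and `strip_mapping_version` are made up for illustration. Any ID that does not fit the pattern raises an error instead of being silently altered, so other suffixes would get noticed rather than dropped.

```python
import re

# Assumed layout, based on the IDs quoted in this thread:
#   <stable ID>[.<version>][_<mapping version>]
# e.g. ENSG00000228327.3_2 -> stable ID ENSG00000228327, version 3, mapping version 2
GENCODE_ID = re.compile(
    r"^(?P<stable_id>ENS[A-Z]*\d+)"   # stable ID, e.g. ENSG00000228327
    r"(?:\.(?P<version>\d+))?"        # optional version, e.g. .3
    r"(?:_(?P<mapping>\d+))?$"        # optional mapping version, e.g. _2
)


def split_gencode_id(gene_id):
    """Return (stable_id, version, mapping_version); missing parts are None."""
    match = GENCODE_ID.match(gene_id)
    if match is None:
        # Fail loudly on anything that does not fit the assumed pattern.
        raise ValueError(f"unexpected ID format: {gene_id!r}")
    return match.group("stable_id"), match.group("version"), match.group("mapping")


def strip_mapping_version(gene_id):
    """Drop the _N mapping version but keep the stable ID and its version."""
    stable_id, version, _mapping = split_gencode_id(gene_id)
    return stable_id if version is None else f"{stable_id}.{version}"


if __name__ == "__main__":
    for gene_id in ["ENSG00000000003.15_4", "ENSG00000228327.3_2", "ENSG00000000003.15"]:
        print(gene_id, "->", strip_mapping_version(gene_id))
```

If the lookup table uses unversioned IDs, the first element returned by `split_gencode_id` can be used as the key instead.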


Perfect! Thank you very much for pointing that out and taking the time to answer!
