Question: What is the difference between transcript id and Ensembl gene id
4.1 years ago by
feng0049 wrote:

Dear Biostars community, I am very new to genetic analysis. I have just finished extracting human RNA gene read counts from FASTQ files into raw count files in order to run a differential expression analysis with the edgeR package. This link describes how I did that.

However, after I obtained the read counts file, I noticed that it contains a series of lines like this:

uc001adk.4  10
uc001adl.3  0
uc001adm.6  0
uc001ado.4  0
uc001adp.4  0

@Pierre Lindenbaum pointed out to me that ucxxxxxx.x is a transcript ID (thank you very much :) ). Also, I have noticed that there are others like:


which are known as Ensembl gene IDs.

May I know what the differences or connections between these two are?

And what is the connection between these different gene IDs and gene symbols?

Thank you all in advance!

rna-seq gene genome • 8.4k views

"uc001adk.4 is a UCSC gene ID"

In fact, this is a transcript ID.

written 4.1 years ago by Pierre Lindenbaum

Thanks, I will edit this. So, can the transcript ID then be converted to a gene ID?

written 4.1 years ago by feng0049
4.1 years ago by
EagleEye wrote:

The difference between an Ensembl gene ID and a transcript ID is:

1) An Ensembl ID starting with ENSGxxxx represents a gene, i.e. a genomic region (gene ID).

2) An Ensembl ID starting with ENSTxxxx represents a transcript (transcript ID).

3) An ENSTxxxx is a transcript variant or splice variant (isoform) of the corresponding gene with an ENSGxxxx ID.

4) One gene (ENSGxxxx / gene symbol) can have multiple corresponding transcript IDs (ENSTxxxx).
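
As a rough illustration of points 1) to 4), the prefix alone is usually enough to tell these ID types apart. The sketch below is not an official parser: the function name `classify_id` and the regular expressions are my own assumptions, based on the usual shape of human Ensembl stable IDs (ENSG or ENST followed by 11 digits, optionally with a `.version` suffix) and of UCSC IDs like the ones in the question.

```python
import re

# Patterns are assumptions based on the usual shape of these IDs,
# not an official specification.
PATTERNS = {
    "ensembl_gene": re.compile(r"^ENSG\d{11}(\.\d+)?$"),
    "ensembl_transcript": re.compile(r"^ENST\d{11}(\.\d+)?$"),
    "ucsc_transcript": re.compile(r"^uc\d{3}[a-z]{3}\.\d+$"),
}


def classify_id(identifier: str) -> str:
    """Return a label for the ID type, or 'unknown' if no pattern matches."""
    for label, pattern in PATTERNS.items():
        if pattern.match(identifier):
            return label
    return "unknown"


# A gene symbol such as TP53 matches none of the patterns.
for example in ["ENSG00000236172", "uc001adk.4", "TP53"]:
    print(example, "->", classify_id(example))
```

Gene symbols have no fixed prefix, so they fall through to "unknown" here; mapping them to IDs needs a lookup table from the annotation itself.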


Example: gene ID ENSG00000236172 has 67 transcript variants, which means that this one ENSGxxxx ID (or gene symbol) has 67 different ENSTxxxx IDs. (Ensembl link parameters: g=ENSG00000236172;r=2:6615389-6650535)

Click the "Show transcript table" button to see them.
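
Because several transcripts belong to one gene, transcript-level counts such as the uc... lines in the question can be collapsed to gene level once you have a transcript-to-gene table from your annotation. Here is a minimal sketch in Python, assuming each read was counted only once. The mapping is made up for illustration (`GENE_A` and `GENE_B` are placeholders, not real annotations), and `sum_counts_by_gene` is a hypothetical helper name.

```python
from collections import defaultdict

# Transcript-level counts copied from the question.
transcript_counts = {
    "uc001adk.4": 10,
    "uc001adl.3": 0,
    "uc001adm.6": 0,
    "uc001ado.4": 0,
    "uc001adp.4": 0,
}

# HYPOTHETICAL transcript-to-gene table; in practice, take this from the
# same annotation release that was used for counting.
transcript_to_gene = {
    "uc001adk.4": "GENE_A",
    "uc001adl.3": "GENE_A",
    "uc001adm.6": "GENE_B",
    "uc001ado.4": "GENE_B",
}


def sum_counts_by_gene(counts, tx2gene):
    """Sum transcript counts per gene; return (gene_counts, unmapped_ids)."""
    gene_counts = defaultdict(int)
    unmapped = []
    for transcript_id, count in counts.items():
        gene_id = tx2gene.get(transcript_id)
        if gene_id is None:
            # Keep track of transcripts the table does not cover.
            unmapped.append(transcript_id)
            continue
        gene_counts[gene_id] += count
    return dict(gene_counts), unmapped


gene_counts, unmapped = sum_counts_by_gene(transcript_counts, transcript_to_gene)
print(gene_counts)
print("unmapped:", unmapped)
```

Reporting the unmapped IDs rather than dropping them silently makes it easier to spot a mismatch between the count file and the annotation.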


There are different annotations available from different sources, such as:


b) RefSeq

c) Ensembl

d) Gencode

etc. Each has its own way of naming genes and transcripts. You should use a single annotation consistently throughout your analysis to avoid confusion.
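
One way to catch mixed annotations early is to guess the source of every ID in a count file and flag the file if more than one source shows up. This is only a sketch: the prefix table and the names `source_of` and `annotation_sources` are my own assumptions and would need extending for real data (Gencode, for instance, reuses Ensembl-style IDs, so it is not distinguished here).

```python
from collections import Counter

# Assumed prefix -> annotation source table; extend it for your own data.
PREFIX_TO_SOURCE = {
    "ENSG": "Ensembl",
    "ENST": "Ensembl",
    "uc": "UCSC",
    "NM_": "RefSeq",
    "NR_": "RefSeq",
}


def source_of(identifier):
    """Guess the annotation source of an ID from its prefix."""
    for prefix, source in PREFIX_TO_SOURCE.items():
        if identifier.startswith(prefix):
            return source
    return "unknown"


def annotation_sources(identifiers):
    """Count how many IDs fall under each guessed annotation source."""
    return Counter(source_of(i) for i in identifiers)


ids = ["uc001adk.4", "uc001adl.3", "ENSG00000236172"]
sources = annotation_sources(ids)
if len(sources) > 1:
    print("Mixed annotations found:", dict(sources))
```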
