What is the difference between transcript id and Ensembl gene id
9.0 years ago
feng0049 • 0

Dear Biostars community, I am very new to genetic analysis. I have just finished extracting human RNA gene read counts from FASTQ files into raw count files in order to run a differential expression analysis with the edgeR package. This link describes how I did that.

However, after obtaining the read count file, I noticed a series of lines in it like this:

...
uc001adk.4  10
uc001adl.3  0
uc001adm.6  0
uc001ado.4  0
uc001adp.4  0
...

@Pierre Lindenbaum pointed out to me that ucxxxxxx.x is a transcript ID (thank you very much :) ). I have also noticed that there are other identifiers, such as:

ENSG00000162367

which are known as Ensembl gene IDs.

May I know what the differences or connections between these two are?

And what is the connection between these different gene IDs and gene symbols?

Thank you all in advance!

RNA-Seq genome gene • 20k views

uc001adk.4 is a UCSC gene ID

In fact, this is a transcript ID.


Thanks, I will edit this. So the transcript ID can then be converted to a gene ID?

9.0 years ago
EagleEye 7.6k

The differences between Ensembl gene and transcript IDs are:

1) An Ensembl ID starting with ENSGxxxx represents a gene, i.e. a genomic region (gene ID).

2) An Ensembl ID starting with ENSTxxxx represents a transcript (transcript ID).

3) An ENSTxxxx transcript is a transcript variant or splice variant (isoform) of the corresponding gene with the ENSGxxxx ID.

4) One gene (ENSGxxxx / gene symbol) can have multiple corresponding transcript IDs (ENSTxxxx).

Example:

Gene ID ENSG00000236172 has 67 transcript variants (that is, this one ENSGxxxx ID / gene symbol corresponds to 67 different ENSTxxxx IDs).

http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000236172;r=2:6615389-6650535

Click the "Show transcript table" button.
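Since one gene can have several transcripts, transcript-level counts can be collapsed to gene-level counts once you have a transcript-to-gene table from your annotation. Below is a minimal sketch of that idea, not a definitive method. The function name `sum_counts_by_gene`, the IDs (`tx1.1`, `geneA`, ...) and the counts are all made up for illustration; in practice the mapping would come from the same annotation that was used for counting.

```python
from collections import defaultdict


def sum_counts_by_gene(transcript_counts, transcript_to_gene):
    """Sum transcript-level counts into gene-level counts.

    Returns (gene_counts, unmapped), where unmapped lists the
    transcript IDs that have no entry in the mapping.
    """
    gene_counts = defaultdict(int)
    unmapped = []
    for transcript_id, count in transcript_counts.items():
        gene_id = transcript_to_gene.get(transcript_id)
        if gene_id is None:
            # Keep track of IDs the annotation does not cover.
            unmapped.append(transcript_id)
            continue
        gene_counts[gene_id] += count
    return dict(gene_counts), unmapped


# Hypothetical data: placeholder IDs, not real annotation entries.
transcript_counts = {"tx1.1": 10, "tx2.1": 5, "tx3.2": 7, "tx4.1": 3}
transcript_to_gene = {"tx1.1": "geneA", "tx2.1": "geneA", "tx3.2": "geneB"}

gene_counts, unmapped = sum_counts_by_gene(transcript_counts, transcript_to_gene)
print(gene_counts)  # {'geneA': 15, 'geneB': 7}
print(unmapped)     # ['tx4.1']
```

Simple summing is only one way to get gene-level numbers; whether it is appropriate depends on how the reads were counted in the first place.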

Annotations:

Annotations are available from several different sources, such as:

a) UCSC

b) RefSeq

c) Ensembl

d) Gencode

etc. Each has its own way of naming genes and transcripts. You should use a single annotation consistently throughout your analysis to avoid confusion.
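A quick way to check that a count file does not mix annotations is to look at the shape of the IDs. The sketch below guesses the naming scheme from the ID format. The regular expressions are my own heuristics based on the usual look of human IDs (e.g. ENSG00000162367, uc001adk.4, NM_-style RefSeq accessions), not an official specification, and the names `ID_PATTERNS`, `classify_id` and `schemes_in` are made up for this example.

```python
import re

# Heuristic patterns (assumptions, not an official specification).
ID_PATTERNS = [
    ("Ensembl gene", re.compile(r"^ENSG\d{11}(\.\d+)?$")),
    ("Ensembl transcript", re.compile(r"^ENST\d{11}(\.\d+)?$")),
    ("UCSC transcript", re.compile(r"^uc\d{3}[a-z]{3}\.\d+$")),
    ("RefSeq transcript", re.compile(r"^[NX][MR]_\d+(\.\d+)?$")),
]


def classify_id(identifier):
    """Return a label for the naming scheme an ID looks like, or 'unknown'."""
    for label, pattern in ID_PATTERNS:
        if pattern.match(identifier):
            return label
    return "unknown"


def schemes_in(identifiers):
    """Return the set of naming schemes found in a list of IDs."""
    return {classify_id(identifier) for identifier in identifiers}


# The two IDs from the question: one UCSC transcript, one Ensembl gene.
ids = ["uc001adk.4", "uc001adl.3", "ENSG00000162367"]
found = schemes_in(ids)
print(found)
if len(found) > 1:
    print("Warning: more than one naming scheme in this list")
```

If more than one scheme shows up, the counts probably come from different annotations and should not be combined directly.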
