Question: What is the difference between a transcript ID and an Ensembl gene ID?
3.3 years ago, feng0049 wrote:

Dear Biostars community, I am very new to genetic analysis. I have just finished extracting gene read counts from human RNA-seq FASTQ files into raw count files, in order to run a differential expression analysis with the edgeR package. This link describes how I did that.

However, after obtaining the read count file, I noticed that it contains a series of lines like this:

...
uc001adk.4  10
uc001adl.3  0
uc001adm.6  0
uc001ado.4  0
uc001adp.4  0
...

@Pierre Lindenbaum pointed out to me that ucxxxxxx.x is a transcript ID (thank you very much :) ). I have also noticed that there are other IDs like:

ENSG00000162367

which are known as Ensembl gene IDs.

May I know what the differences or connections between these two are?

And what is the connection between these different gene IDs and gene symbols?

Thank you all in advance!

rna-seq gene genome • 6.0k views

uc001adk.4 is a UCSC gene ID

In fact, this is a transcript ID.

written 3.3 years ago by Pierre Lindenbaum

Thanks, I will edit this. So the transcript ID can then be converted to a gene ID?

written 3.3 years ago by feng0049
3.3 years ago, EagleEye (Sweden) wrote:

The difference between an Ensembl gene ID and a transcript ID is:

1) An Ensembl ID starting with ENSGxxxx represents a gene, i.e. a genomic region (gene ID).

2) An Ensembl ID starting with ENSTxxxx represents a transcript (transcript ID).

3) Each ENSTxxxx is a transcript variant (splice variant / isoform) of the corresponding gene with an ENSGxxxx ID.

4) One gene (ENSGxxxx / gene symbol) can have multiple corresponding transcript IDs (ENSTxxxx).

Example:

Gene ID ENSG00000236172 has 67 transcript variants (which means that this one ENSGxxxx ID or gene symbol has 67 different ENSTxxxx IDs).

http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000236172;r=2:6615389-6650535

Click the "Show transcript table" button.
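
Because one gene can have several transcripts (point 4), transcript-level counts can in principle be collapsed to gene level by summing the counts of all transcripts that belong to the same gene. Here is a minimal Python sketch of that idea. The counts are the ones shown in the question, but the transcript-to-gene mapping and the labels GENE_A / GENE_B are made-up placeholders; a real mapping would have to come from the same annotation that was used for counting.

```python
from collections import defaultdict


def sum_counts_by_gene(transcript_counts, transcript_to_gene):
    """Collapse transcript-level counts to gene-level counts by summing.

    Returns (gene_counts, unmapped), where unmapped lists the transcript
    IDs that have no entry in the mapping.
    """
    gene_counts = defaultdict(int)
    unmapped = []
    for transcript_id, count in transcript_counts.items():
        gene_id = transcript_to_gene.get(transcript_id)
        if gene_id is None:
            # Do not guess a gene; report the transcript instead.
            unmapped.append(transcript_id)
        else:
            gene_counts[gene_id] += count
    return dict(gene_counts), unmapped


# Transcript counts as shown in the question.
transcript_counts = {
    "uc001adk.4": 10,
    "uc001adl.3": 0,
    "uc001adm.6": 0,
    "uc001ado.4": 0,
    "uc001adp.4": 0,
}

# Hypothetical mapping for illustration only (assumption, not real annotation).
transcript_to_gene = {
    "uc001adk.4": "GENE_A",
    "uc001adl.3": "GENE_A",
    "uc001adm.6": "GENE_B",
}

gene_counts, unmapped = sum_counts_by_gene(transcript_counts, transcript_to_gene)
print(gene_counts)
print(unmapped)
```

Transcripts without a mapping are reported rather than silently dropped, which makes it easier to spot a mismatch between the count file and the annotation.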

Annotations:

Several annotations are available from different sources, such as:

a) UCSC

b) RefSeq

c) Ensembl

d) GENCODE

etc. Each has its own way of naming genes and transcripts. Use a single annotation consistently throughout your analysis to avoid confusion.
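
One way to check for mixed annotations is to look at the shape of the IDs in a count file. The sketch below is a rough Python heuristic, not an official parser: the regular expressions are my assumptions based on the common ID formats (human Ensembl IDs as ENSG/ENST plus 11 digits with an optional version suffix, UCSC IDs like uc001adk.4, RefSeq transcript accessions such as NM_ or NR_), and the function names are hypothetical.

```python
import re

# Heuristic patterns (assumptions based on common ID formats).
ID_PATTERNS = [
    ("Ensembl gene", re.compile(r"ENSG\d{11}(\.\d+)?")),
    ("Ensembl transcript", re.compile(r"ENST\d{11}(\.\d+)?")),
    ("UCSC transcript", re.compile(r"uc\d{3}[a-z]{3}\.\d+")),
    ("RefSeq transcript", re.compile(r"[NX][MR]_\d+(\.\d+)?")),
]


def classify_id(identifier):
    """Return a label for the ID format, or 'unknown' if nothing matches."""
    for label, pattern in ID_PATTERNS:
        if pattern.fullmatch(identifier):
            return label
    return "unknown"


def annotation_sources(identifiers):
    """Return the set of ID formats found in a collection of identifiers."""
    return {classify_id(identifier) for identifier in identifiers}


# IDs taken from the question: two UCSC transcript IDs and one Ensembl gene ID.
ids = ["uc001adk.4", "uc001adl.3", "ENSG00000162367"]
sources = annotation_sources(ids)
print(sorted(sources))
if len(sources) > 1:
    print("Warning: more than one ID format found; annotations may be mixed.")
```

If more than one format shows up in the same file, that is a hint that counts from different annotations have been combined.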
