Question: What is a gene transcript?
3.5 years ago, Dave wrote:


I've recently started using Ensembl for help in designing a gene panel for NGS.

Each time I select a gene on Ensembl it comes up with different available transcripts for that gene.

What defines a transcript? Is it literally just a different version of the gene in different individuals? If so, how come some of the transcripts appear to be so different in bp length and amino acid number?

Any help in explaining would be most gratefully received!

modified 3.5 years ago by mirza • written 3.5 years ago by Dave

In addition to all answers,

A: How to check the type of identifier from ensembl?

modified 3.5 years ago • written 3.5 years ago by EagleEye
3.5 years ago, Devon Ryan (Freiburg, Germany) wrote:

A single gene can produce multiple different RNAs (i.e., transcripts). The actual transcript observed will depend on the tissue, developmental time point, and environmental/hormonal/etc. factors. Typically there's a single major transcript expressed in a given cell at a given time, but not always. For further details, check out Wikipedia.

written 3.5 years ago by Devon Ryan

Thanks for the reply Devon!

Just to confirm that I have understood: the sequence I am reading when I pick a transcript is not in fact the genomic sequence from the reference genome, but represents the RNA transcript from that gene? If this is the case, then why do the transcripts also include the introns? Is it supposed to reflect pre-splicing?

written 3.5 years ago by Dave

The transcript sequence usually doesn't include intronic sequence (though an intron in one transcript can be an exon in another). If it does then it's pre-mRNA.

written 3.5 years ago by Devon Ryan
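
To make the pre-mRNA versus spliced distinction in the reply above concrete, here is a minimal sketch. The sequences and exon coordinates are invented for illustration and do not come from Ensembl. The sequence is written with T rather than U, which is also how transcript sequences are usually displayed as cDNA.

```python
# Toy locus: two exons separated by one intron (all sequences are made up).
exon1 = "ATGGCC"
intron1 = "GTAAGTTTTCAG"
exon2 = "GGATTCTGA"
genomic = exon1 + intron1 + exon2

# Exon coordinates on `genomic`, 0-based and end-exclusive.
exons = [
    (0, len(exon1)),
    (len(exon1) + len(intron1), len(genomic)),
]

def pre_mrna(seq, exons):
    """Unspliced sequence: first exon start to last exon end, introns included."""
    return seq[exons[0][0]:exons[-1][1]]

def spliced(seq, exons):
    """Spliced sequence: exons only, joined in genomic order."""
    return "".join(seq[start:end] for start, end in exons)

unspliced_seq = pre_mrna(genomic, exons)
spliced_seq = spliced(genomic, exons)

print("pre-mRNA:", unspliced_seq, len(unspliced_seq), "bp")
print("spliced: ", spliced_seq, len(spliced_seq), "bp")
```

If the sequence you are looking at is the length of the spliced version, you are seeing exons only. If it matches the full span of the locus, it is the unspliced form.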
3.5 years ago, dariober (WCIP | Glasgow | UK) wrote:

This is more of a comment, possibly a rant. In my opinion the concept of gene is obsolete and we would be better off if we ditched the concept of "gene" altogether.

Genes made sense when it was thought that there were discrete units (genes), each of which produced a single transcript and a single protein. However, a single gene can produce multiple transcripts, and these can be very different from one another (see for example the table of transcripts of the Actin gene; there are several coding and non-coding transcripts). So when we say gene X has a mutation, it is not clear what we are referring to. How many transcripts are affected by this mutation? Are they coding or pseudogenes? In my opinion it would be simpler to think in "transcript space" and forget about genes.

Another way of seeing this is to consider that while transcripts exist, genes don't. When you do an mRNA extraction you isolate transcripts, not genes; you can't isolate genes. You could say you have a restriction enzyme that cuts left and right of a "gene", but this is also not true: you have an enzyme that cuts at positions A and B, which happen to include a bunch of transcripts. If the definition of that gene (and its transcripts) changes, the enzyme still cuts there because it doesn't "see" genes, it sees DNA sequence.

I think "genes" hang around because they make some statements simpler ("Mutation at position A hits gene X" as opposed to "Mutation at position A hits an intron of transcripts X, an exon of transcript Y, and a UTR of Z"), but they are at best incomplete statements.

Any thoughts?

modified 3.5 years ago • written 3.5 years ago by dariober
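
The "mutation at position A hits an intron of X, an exon of Y and a UTR of Z" statement in the post above can be sketched as a per-transcript lookup. Everything below is hypothetical: the transcript names, the coordinates and the `Transcript` and `classify` helpers are invented for illustration, and strand is ignored for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Transcript:
    name: str
    # Exons as (start, end) genomic intervals, 1-based inclusive, sorted by start.
    exons: List[Tuple[int, int]]
    # Genomic span of the coding region, or None for a non-coding transcript.
    cds: Optional[Tuple[int, int]]

def classify(pos: int, tx: Transcript) -> str:
    """Say where a genomic position falls relative to one transcript."""
    if pos < tx.exons[0][0] or pos > tx.exons[-1][1]:
        return "outside"
    if not any(start <= pos <= end for start, end in tx.exons):
        return "intron"
    if tx.cds is None:
        return "non-coding exon"
    if tx.cds[0] <= pos <= tx.cds[1]:
        return "coding exon"
    return "UTR"

# Three made-up transcripts sharing one locus.
transcripts = [
    Transcript("X", [(100, 200), (400, 500)], (150, 450)),
    Transcript("Y", [(100, 200), (280, 320), (400, 500)], (150, 450)),
    Transcript("Z", [(250, 500)], (420, 480)),
]

position_a = 300
effects = {tx.name: classify(position_a, tx) for tx in transcripts}
for name, effect in effects.items():
    print(f"position {position_a}: {effect} of transcript {name}")
```

With these toy coordinates the same position is intronic in X, coding in Y and untranslated in Z, which is why a gene-level statement alone can be incomplete.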

The original definition of a gene comes from the field of classical genetics where it is the unit of inheritance of a phenotypic trait. So any statement about mutation in gene X is actually perfectly clear from that point of view. The problem is to link this definition to its molecular underpinning. It seems different people do it differently. I like the definition used by EnsEMBL that a gene is a locus that produces a set of similar and functionally-related transcripts. I think the notion of gene is still useful despite its fuzziness not least because it is a convenient level of data integration as long as one is careful to not mix definitions.

written 3.5 years ago by Jean-Karim Heriche

I also like the EnsEMBL definition you quote. Do you have a link for the original definition? I couldn't find it after 5 min of googling.

written 3.5 years ago by Carlo Yague

It's in the section describing the genome annotation pipeline, although it is not stated there exactly as I put it. I first got this definition from a discussion with someone at the EBI some years ago, but it is also apparent from the way the EnsEMBL data is structured. If you're interested in digging into this issue, you may also want to read this paper, which reviews the history of the gene definition and concludes by suggesting a new one:

The gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products.

which to me is another way of stating the definition I mentioned above.

written 3.5 years ago by Jean-Karim Heriche
3.5 years ago, mirza wrote:

To understand this in general terms: information in DNA is transcribed into RNA, which is also called a transcript (because the genomic information is "transcribed" into it) and which directs protein formation. A "gene" is a genomic sequence made up of two types of region: exons, which are retained in the mature RNA/transcript, alternating with introns, which are transcribed but spliced out of it. A gene typically has several exons and introns (E1-I1-E2-I2-E3-I3-E4), and different combinations of exons end up in the RNA/transcript, e.g. E1-E2-E3, E1-E3-E4 or E1-E2, depending on the tissue etc., as written above; hence the differences in sequence and length.

written 3.5 years ago by mirza
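
As a rough illustration of why the exon combinations in the answer above give transcripts of different bp length, here is a small sketch. The exon lengths and the transcript labels T1 to T3 are invented; only the exon combinations are taken from the answer.

```python
# Hypothetical exon lengths in bp for a gene with four exons.
exon_lengths = {"E1": 120, "E2": 90, "E3": 210, "E4": 150}

# The exon combinations mentioned in the answer; T1-T3 are made-up labels.
isoforms = {
    "T1": ["E1", "E2", "E3"],
    "T2": ["E1", "E3", "E4"],
    "T3": ["E1", "E2"],
}

def transcript_length(exon_names):
    """Length of a spliced transcript: the sum of the lengths of its exons."""
    return sum(exon_lengths[name] for name in exon_names)

lengths = {name: transcript_length(exons) for name, exons in isoforms.items()}
for name, length in lengths.items():
    print(name, "-".join(isoforms[name]), length, "bp")
```

The amino acid count differs for the same reason, since the coding region is built from whichever exons a given transcript includes.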