Question: Does TSS = first exon?
5.8 years ago, NHEJ (United States) wrote:

Could anyone please clarify whether or not in computational studies one can treat the transcription start site (TSS) as being equivalent to the first exon?  

Some seemingly contradictory quotes (and sources) that I've found on this issue:

"Promoter sequences are usually the sequence immediately upstream the transcription start site (TSS) or first exon." (SOURCE: http://www.protocol-online.org/forums/blog/4/entry-10-from-how-to-find-promoter-sequences/)

"The TSS is the first nucleotide of the UTR (at least I think so, I don't think there's any gene which immediately begins with ATG), so yes, UTRs can also be 'relaxed' and differ in length." (SOURCE: http://www.protocol-online.org/forums/index.php?app=forums&module=forums&section=printtopic&client=printer&f=1&t=6050)

"No, the first codon of the first exon is the start codon "ATG" which also codes for methionine. This is called the translation start site.  The transcription start site is where the RNA polymerase binds to in the 5' UTR upstream of the start codon. IMHO  Maybe someone else can elaborate more. I dont want to give you the incorrect info."  (SOURCE: http://seqanswers.com/forums/archive/index.php/t-12773.html)

modified 20 months ago by yuktichopra6750 • written 5.8 years ago by NHEJ
5.8 years ago, Istvan Albert (University Park, USA) wrote:

These terms always need to be defined with respect to an authoritative resource, not posts on websites.

This is why the Sequence Ontology exists:

http://www.sequenceontology.org/index.html

For TSS, see the term entry:

http://www.sequenceontology.org/browser/current_svn/term/SO:0000315

There you can investigate the proper context.

modified 5.8 years ago • written 5.8 years ago by Istvan Albert

Could you please elaborate on Steve Lianoglou's comment below?  

written 5.8 years ago by NHEJ

That comment was attached to my answer which I should not have posted whilst half-asleep :) and have since deleted. I was thinking exclusively about protein-coding genes, in which the first exon is the translational start of the protein. However, exon can also be defined as "what's left in the mature RNA after splicing" and may have nothing to do with protein coding. In any case, TSS is not equivalent.

written 5.8 years ago by Neilfws

Sorry to be a mosquito, but I'll still argue that even in protein coding genes, the first exon is not defined by the translation start. An exon is (and should only ever be) defined by "what's left in the mature RNA after splicing", and (for instance) the "spliced bits" in the 5'UTR of the human ALAS1 gene are still called "exons."

Exons are defined by the ~~splicing machinery~~ RNA processing machinery, not the translational machinery.

I've struck through "splicing machinery" because we have cases like XBP1, which is post-transcriptionally processed by ERN1, which splits one exon into two -- and I don't think anyone would call ERN1 part of any splicing machinery.

All that having been said -- do you have any references where people are actually going by the definition you are proposing?

 

written 5.8 years ago by Steve Lianoglou
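
To make the point about spliced 5' UTR exons concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `five_prime_utr_segments` is hypothetical, the coordinates are invented (they are not the real ALAS1 annotation), the transcript is assumed to be on the plus strand, and intervals are 0-based, half-open.

```python
def five_prime_utr_segments(exons, cds_start):
    """Return the exonic pieces that lie upstream of the translation start.

    exons     -- list of (start, end) tuples, 0-based half-open, plus strand
    cds_start -- genomic position of the first coding base (0-based)
    """
    utr = []
    for start, end in sorted(exons):
        if start >= cds_start:
            break  # this exon and all later ones are past the start codon
        # Keep the whole exon, or only the part upstream of the start codon.
        utr.append((start, min(end, cds_start)))
    return utr


# Invented three-exon transcript: the first exon is entirely non-coding,
# and translation starts 50 bases into the second exon.
exons = [(1000, 1100), (2000, 2150), (3000, 3400)]
cds_start = 2050

utr = five_prime_utr_segments(exons, cds_start)
utr_length = sum(end - start for start, end in utr)
print(utr, utr_length)
```

In this toy transcript the first exon contains no coding sequence at all, so "first exon" and "translation start" are clearly different things, while the transcript start still coincides with the first base of the first exon.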

I'm not proposing a definition, I'm using sloppy language which you are doing a great job of making less sloppy. I agree, exons are not "defined" by translational machinery. I guess I was trying to keep things simple in the context of the original question, to which the answer is "no, TSS is not first exon".

P.S. in my day, i.e. about 20 years ago, we remembered which were introns and which were exons by "exons are expressed". This newfangled definition of "exons are what's left after splicing" does not sit well in my old brain at all :)

modified 5.8 years ago • written 5.8 years ago by Neilfws

>to which the answer is "no, TSS is not first exon".

Please give an example in which the TSS is not the start of the first exon. I have shown below that in all the major mouse annotations, TSS == start of first exon. Biochemically, I would say the reason, at least for protein-coding mRNAs, is the 5' cap, which is attached to the 5' end at transcription initiation and is necessary for export, translation (splicing?), etc.

written 5.8 years ago by Ido Tamir

Circular RNAs might be a counterexample.

written 5.8 years ago by Ido Tamir

How I wish I could turn back time and start again with my answers; it appears I'm just confusing myself and everyone else :)

OK: if we are including UTRs in the definition of exon and if we're assuming that transcript starts in the UCSC database really are transcript starts (I have always wondered how many are experimentally-determined) then yes - the TSS is equivalent to the first position in the first exon.

If you're an old fart like me who was taught that exon = expressed region - which is now incorrect - then TSS is nothing to do with exons at all.

I hope this helps.

written 5.8 years ago by Neilfws

The sequence ontology also defines the exon as:

A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.

But of course, one thing I learned in biology is that there are always exceptions, as Steve Lianoglou points out.

written 5.8 years ago by Istvan Albert

How does one computationally find the TSS of a gene? What if there are multiple genes involved and you must resort to computational measures (not experimental measures like RACE)?

written 5.8 years ago by NHEJ

As far as I know, although there are computational methods for TSS prediction (which you can find by web/literature search), only experimental methods such as RACE actually determine the TSS.

written 5.8 years ago by Neilfws
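
For many genes at once, one pragmatic option is to read the annotated TSSs out of a gene annotation rather than predict them. The Python sketch below does that for GTF-style lines. It is only a sketch under stated assumptions: the function name `annotated_tss`, the gene IDs, and the coordinates are invented; the input is assumed to contain `transcript` feature rows with a `gene_id` attribute (not every GTF does); and the positions are only as reliable as the annotation, which is not the same as an experimentally determined TSS.

```python
import re
from collections import defaultdict

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def annotated_tss(gtf_lines):
    """Collect the distinct annotated TSS positions of each gene.

    GTF coordinates are 1-based and inclusive, so the TSS is the start
    column on the plus strand and the end column on the minus strand.
    """
    tss = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        match = GENE_ID.search(fields[8])
        if match is None or strand not in ("+", "-"):
            continue
        position = start if strand == "+" else end
        tss[match.group(1)].add((chrom, position, strand))
    return {gene: sorted(sites) for gene, sites in tss.items()}


# Invented records: geneA has two transcripts with different starts,
# geneB is on the minus strand.
gtf_lines = [
    'chr1\tdemo\ttranscript\t1001\t5000\t.\t+\t.\tgene_id "geneA"; transcript_id "geneA.1";',
    'chr1\tdemo\ttranscript\t1201\t5000\t.\t+\t.\tgene_id "geneA"; transcript_id "geneA.2";',
    'chr1\tdemo\texon\t1001\t1100\t.\t+\t.\tgene_id "geneA"; transcript_id "geneA.1";',
    'chr2\tdemo\ttranscript\t7001\t9000\t.\t-\t.\tgene_id "geneB"; transcript_id "geneB.1";',
]

result = annotated_tss(gtf_lines)
for gene, sites in result.items():
    print(gene, sites)
```

Note that a gene can carry several annotated TSSs (one per transcript), so "the TSS of a gene" is often a set of positions rather than a single one.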
5.8 years ago, Ido Tamir (Austria) wrote:
mysql --user=mm9 --host=genome-mysql.cse.ucsc.edu -A

mysql> use mm9
mysql> select COUNT(*) from refGene where txStart != substring_index(exonStarts, ",", 1);
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.25 sec)

mysql> select COUNT(*) from refGene where txStart = substring_index(exonStarts, ",", 1);
+----------+
| COUNT(*) |
+----------+
|    34025 |
+----------+
1 row in set (0.33 sec)

mysql> select COUNT(*) from refGene;
+----------+
| COUNT(*) |
+----------+
|    34025 |
+----------+
1 row in set (0.18 sec)

mysql> select COUNT(*) from ensGene where txStart != substring_index(exonStarts, ",", 1);
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.36 sec)

mysql> select COUNT(*) from knownGene  where txStart != substring_index(exonStarts, ",", 1);
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.31 sec)

 

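CAVEAT: strand direction! In these tables txStart is always the lower genomic coordinate, so for minus-strand genes the TSS is at txEnd and should be compared with the last entry of exonEnds.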

modified 5.8 years ago • written 5.8 years ago by Ido Tamir
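
The strand caveat in the answer above can be sketched in Python. This is a minimal illustration, not the poster's method: the function name `tss_and_first_exon` and the two example rows are invented, and the rows are assumed to follow the UCSC gene-table layout used in the queries (a `strand` column, `txStart`/`txEnd`, and comma-separated `exonStarts`/`exonEnds` with a trailing comma).

```python
def tss_and_first_exon(strand, tx_start, tx_end, exon_starts, exon_ends):
    """Return (TSS boundary, 5' boundary of the first exon) for one row.

    Coordinates are returned exactly as stored in the table; no
    conversion between 0-based and 1-based positions is attempted.
    """
    starts = [int(x) for x in exon_starts.strip(",").split(",")]
    ends = [int(x) for x in exon_ends.strip(",").split(",")]
    if strand == "+":
        # Transcription runs left to right: the first exon is the leftmost.
        return tx_start, starts[0]
    if strand == "-":
        # Transcription runs right to left: the first exon is the rightmost.
        return tx_end, ends[-1]
    raise ValueError(f"unexpected strand: {strand!r}")


# Two invented rows with identical coordinates on opposite strands.
rows = [
    ("+", 100, 500, "100,300,", "200,500,"),
    ("-", 100, 500, "100,300,", "200,500,"),
]

for row in rows:
    tss, exon_edge = tss_and_first_exon(*row)
    print(row[0], tss, exon_edge, tss == exon_edge)
```

The queries in the answer compare `txStart` with the first `exonStarts` entry for every row, which is the TSS comparison only for plus-strand genes; for minus-strand genes the equivalent check is `txEnd` against the last `exonEnds` entry.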