TSS / TTS in Ensembl gene annotation?
8.3 years ago

Hello, I have one question about a gene annotation I downloaded recently in gff3 format. Below is an abbreviated example containing the first few lines of this file:

##gff-version 3
# Generated on Tue Nov 27 19:25:49 2012
# UCSC table file ./ucsc_tables/hg19/ensGene.txt
chr1    ensGene    gene       11869    14412    .    +    .    Name=..
chr1    ensGene    ncRNA    11869    14409    .    +    .    Name=..
chr1    ensGene    exon       11869    12227    .    +    .    Name=..
chr1    ensGene    exon       12613    12721    .    +    .    Name=..
..
..
chr1    ensGene    gene       14363    29806    .    -    .    Name=..
chr1    ensGene    ncRNA    14363    29370    .    -    .    Name=..
chr1    ensGene    exon       14363    14829    .    -    .    Name=..
..
..

As shown above, each gene is followed by an arbitrary number of exons.

My question: is it correct to assume that the start and end coordinates of a listed gene represent the transcription start site (TSS) and transcription termination site (TTS)?

I need these two positions to measure the distance to certain alternative splicing events that I have computed with MISO (unfortunately, the MISO output doesn't provide them).

Best regards

ensembl gene-annotation gff • 11k views

My old question on this subject may help you, with adjustments for your genome of interest. It includes some scripting to grab TSS coordinates from an Ensembl GTF or via the Ensembl Perl API. You will need to take the strand of each annotation into account to derive a useful TSS value from its coordinates.

8.3 years ago
Emily 23k

The start coordinate of forward-strand genes and the end coordinate of reverse-strand genes represent the TSS of the most 5' transcript of the gene. Other transcripts of the gene may have different TSSs. To get all TSSs, you should use the cDNA features in the file, i.e. the transcript-level lines such as the ncRNA rows in your excerpt.
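To make the strand rule concrete, here is a minimal Python sketch that reads the transcript-level rows of a GFF3 file and reports a TSS and TTS for each. It is only an illustration under a few assumptions: the rows are tab-delimited as the GFF3 format requires (the excerpt in the question is shown with spaces), the set of feature types treated as transcripts (`TRANSCRIPT_TYPES`) is my own guess, and the helper names `tss_tts` and `transcript_ends` are made up for this example.

```python
# Rows taken from the question's excerpt, re-typed with tabs.
# The elided Name=.. attributes are left exactly as posted.
GFF3_TEXT = (
    "##gff-version 3\n"
    "chr1\tensGene\tgene\t11869\t14412\t.\t+\t.\tName=..\n"
    "chr1\tensGene\tncRNA\t11869\t14409\t.\t+\t.\tName=..\n"
    "chr1\tensGene\texon\t11869\t12227\t.\t+\t.\tName=..\n"
    "chr1\tensGene\tgene\t14363\t29806\t.\t-\t.\tName=..\n"
    "chr1\tensGene\tncRNA\t14363\t29370\t.\t-\t.\tName=..\n"
)

# Assumption: feature types that count as transcript-level rows.
TRANSCRIPT_TYPES = {"mRNA", "ncRNA", "transcript"}


def tss_tts(start, end, strand):
    """Return (TSS, TTS) for one feature, in the file's 1-based coordinates."""
    if strand == "+":
        return start, end   # transcription runs left to right
    if strand == "-":
        return end, start   # transcription runs right to left
    raise ValueError(f"unsupported strand: {strand!r}")


def transcript_ends(gff3_text):
    """Collect (chrom, strand, TSS, TTS) for every transcript-level row."""
    results = []
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] not in TRANSCRIPT_TYPES:
            continue  # skip malformed rows and non-transcript features
        chrom, strand = fields[0], fields[6]
        tss, tts = tss_tts(int(fields[3]), int(fields[4]), strand)
        results.append((chrom, strand, tss, tts))
    return results


for record in transcript_ends(GFF3_TEXT):
    print(record)
```

For the two ncRNA rows in the excerpt this gives a TSS of 11869 on the forward strand and 29370 on the reverse strand. Note that the second gene row ends at 29806, so that gene presumably has another transcript (not shown in the excerpt) that starts further upstream.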


Thank you very much for your help. For some reason I never considered the cDNA features in this file, but it actually makes perfect sense :)


Hi Emily, my annotation has "gene", "transcript" and "exon" features. Should I take the TSS from the transcript start or from the gene start?


The TSS is the start of the transcript, i.e. the start coordinate on the forward strand and the end coordinate on the reverse strand.
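A small sketch of why the choice matters, using a made-up reverse-strand gene with two transcripts. All coordinates and the names `tx1` and `tx2` are hypothetical and only serve to illustrate the point: the gene-level value matches only the most 5' transcript.

```python
def tss(start, end, strand):
    """TSS of a feature: start on '+', end on '-' (1-based coordinates)."""
    return start if strand == "+" else end


# Hypothetical reverse-strand gene and its two transcripts (illustrative only).
gene = (1000, 5000, "-")
transcripts = {
    "tx1": (1000, 5000, "-"),
    "tx2": (1000, 4200, "-"),
}

gene_tss = tss(*gene)
transcript_tss = {name: tss(*coords) for name, coords in transcripts.items()}

print(gene_tss)        # TSS implied by the gene row
print(transcript_tss)  # one TSS per transcript
```

Here the gene row would give 5000 for both transcripts, while `tx2` actually starts at 4200, so distances measured from the gene-level value would be off for that transcript.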
