Question: Ensembl Exon Phase Notation
7.4 years ago, HAlit wrote:


I would like to retrieve some exon sequences, translate them to amino acid sequences, and then BLAST them against a proteome.

I am working with exon sequences from Ensembl.

Ensembl uses a value called phase to mark codons that are interrupted by introns, as follows:

In the diagrams below, N and # mark the bases of successive codons (the symbol switches at each codon boundary within an exon so the reading frame is visible), and x marks intron bases.

Exon start phase: 0 - no interruption. NNNxxxxxxNNN###NNN###

Exon start phase: 1 - first codon's first base is in the previous exon. NxxxxxxxNN###NNN###NNN

Exon start phase: 2 - first codon's first two bases are in the previous exon. NNxxxxxxxxN###NNN###NNN

In addition to start phase, there is also an end phase, which works similarly.

Exon end phase: 1 - the last codon's first base is in this exon and its last two bases are in the next exon. NNN###NxxxxxxNN

Exon end phase: 2 - the last codon's first two bases are in this exon and its last base is in the next exon. NNN###NNxxxxxxxN

I assume these descriptions are correct - please let me know if they are not.

I downloaded phase information using BioMart so that I can later map it back to the exon sequences and remove these interrupted codons. The problem is that BioMart provides only a single phase value, which I guess is the start phase. Does anyone know why the end phase is missing?
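
The trimming step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the exon sequence is given 5' to 3' on the coding strand, the start phase counts the bases of the first codon that lie in the previous exon, and the end phase counts the bases of the trailing incomplete codon that lie in this exon (Ensembl's convention, as far as I know). The function names `trim_to_whole_codons` and `translate` are made up for this example, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def trim_to_whole_codons(exon_seq, start_phase, end_phase):
    """Drop the partial codons at both ends of a coding exon.

    Assumes start_phase = bases of the first codon located in the previous
    exon, and end_phase = bases of the last, incomplete codon located in
    this exon. A phase of -1 (non-coding end) is rejected, because the
    coding boundary cannot be found from the phase alone.
    """
    if start_phase not in (0, 1, 2) or end_phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1 or 2 for a fully coding exon")
    lead = (3 - start_phase) % 3          # bases finishing the split first codon
    stop = len(exon_seq) - end_phase      # bases starting the split last codon
    return exon_seq[lead:max(stop, lead)]


def translate(seq):
    """Translate whole codons; unknown codons (e.g. containing N) become X."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# Toy exon: 2 bases completing a split codon, 3 whole codons, 1 trailing base.
trimmed = trim_to_whole_codons("GGATGGCCAAAT", start_phase=1, end_phase=1)
print(trimmed, translate(trimmed))  # ATGGCCAAA MAK
```

Exons with phase -1 are rejected rather than guessed at, since for those the position of the coding start or end has to come from the transcript annotation.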

Thank you

7.4 years ago, Matt LaFave (San Diego, CA) wrote:

You're correct in your interpretation of the descriptions, and in assuming that the "phase" listed in BioMart is the start phase. You'll also run into a phase labeled -1, which I believe means that the start of that exon is non-coding.

I'm not entirely sure why end phase is missing, but if I had to guess, it's because it's essentially redundant information. If, for whatever reason, you find that you need both the start and end phase of a given exon, you can determine the end phase by looking at the start phase of the downstream exon. Hope that helps!
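
For an exon-centric workflow, the same redundancy can be used without consulting the downstream exon: for an exon that is coding along its whole length, the end phase should follow from the start phase and the exon length. The sketch below assumes that relationship, end phase = (start phase + length) mod 3; the function name `end_phase_from_start` is hypothetical, and exons that begin or end in UTR are not handled.

```python
def end_phase_from_start(start_phase, exon_length):
    """Infer the end phase of a fully coding exon from its own data.

    Assumes end phase = (start phase + exon length) mod 3, which only holds
    when the whole exon is coding. Returns None for a start phase of -1
    (non-coding start), where this shortcut does not apply.
    """
    if start_phase == -1:
        return None
    if start_phase not in (0, 1, 2):
        raise ValueError("start phase must be -1, 0, 1 or 2")
    return (start_phase + exon_length) % 3


# A 12-base exon entered in phase 1 leaves one base of an unfinished codon.
print(end_phase_from_start(1, 12))  # 1
```

The result can be checked against the start phase of the downstream exon, which should be the same value when both exons are coding.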


Thanks Matt. Indeed, I do realize that it's essentially redundant information, given the phase of the downstream exon. However, since I am interested in performing an exon-centric analysis, I would prefer to do this without having to look at the downstream exon. I guess it is just missing - strange.

Regarding -1, you are right, it's for UTR exons.

Reply written 7.4 years ago by HAlit