Phylogeny using UTR sequences: any insights?
8.9 years ago
cyril-cros ▴ 950

A rather general question:

The phylogeny of many gene families has been studied using only the coding sequences: identify the relevant genes, align them, build trees, and interpret the results.

How can you adapt this process to newly annotated full-length transcripts? Do introns matter?
I guess you can repeat the analysis with full-length transcripts and compare the resulting trees with the CDS-only trees, classify the genes according to the length and structure of their 3' UTR, or look for regulatory motifs. Is there anything else worth doing, or are there relevant tools?
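To make the "compare the trees" idea concrete, here is a minimal sketch that counts the clades found in one rooted tree but not in the other (a rooted variant of the Robinson-Foulds distance). The nested-tuple tree encoding, the gene names and both topologies are invented for illustration; real trees would come from a tree-building program and a proper phylogenetics library.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree given as nested tuples.

    Each clade is a frozenset of leaf names; the clade containing every
    leaf is dropped because it is shared by any two trees on the same genes.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf is just a gene name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)
    return found


def clade_distance(tree1, tree2):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree1) ^ clades(tree2))


# Hypothetical topologies: one from CDS only, one from full-length transcripts.
cds_tree = ((("OR1", "OR2"), "OR3"), "OR4")
full_tree = ((("OR1", "OR3"), "OR2"), "OR4")

print(clade_distance(cds_tree, full_tree))
```

A distance of zero means both data sets support the same groupings; larger values point to genes whose placement changes once non-coding sequence is included.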

I would be interested in any advice, article or textbook reference. This kind of phylogeny often seems to be done on viruses...

RNA-Seq annotation phylogeny • 1.9k views
8.8 years ago
abascalfederico ★ 1.2k

The question is interesting. However, this kind of comparison will be unfeasible unless the species are very closely related (e.g. primates), because UTRs, introns, etc. diverge very fast. Even if some regions remain conserved within UTRs or introns, sequence alignment would be very difficult or even impossible because of the unconserved neighboring regions.

Alternatively, you could try an alignment-free approach. For instance, you could build a matrix of characters in which each character represents a given feature (e.g. conservation of a certain element, length of the UTR, number of exons, etc.).
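A minimal sketch of such a character matrix, assuming three features per gene. The gene names, feature values and the 500 nt threshold are all made up for illustration, and the simple mismatch fraction stands in for whatever distance or parsimony method you would actually use.

```python
from itertools import combinations

# Hypothetical feature table: one entry per gene, values invented for illustration.
genes = {
    "geneA": {"utr3_length": 250, "n_exons": 2, "has_motif": True},
    "geneB": {"utr3_length": 300, "n_exons": 2, "has_motif": True},
    "geneC": {"utr3_length": 1800, "n_exons": 5, "has_motif": False},
}


def encode(features):
    """Turn raw features into a tuple of discrete character states."""
    return (
        "short" if features["utr3_length"] < 500 else "long",  # arbitrary cutoff
        features["n_exons"],
        int(features["has_motif"]),
    )


def character_distance(a, b):
    """Fraction of characters whose states differ between two genes."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Character matrix: one row of states per gene.
matrix = {name: encode(f) for name, f in genes.items()}

# Pairwise distances, usable as input to a distance-based tree method.
distances = {
    (g1, g2): character_distance(matrix[g1], matrix[g2])
    for g1, g2 in combinations(sorted(matrix), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 2))
```

How continuous features such as UTR length are binned will strongly affect the result, so the cutoffs deserve as much attention as the choice of characters.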


Thanks for the comment. I was considering olfactory receptor genes, which are poorly annotated. They tend to be a bit of a mess (genomic clusters with lots of similar sequences). Zhang and Firestein wrote some papers on their CDS in 2002/2004.
An interesting question is whether they have conserved regions in their introns/UTRs, and whether the introns/UTRs evolve at a faster pace than the CDS.
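As a rough first pass at the "faster pace" question, one could compare the uncorrected p-distance (fraction of differing sites) of the CDS and of the UTR for the same pair of genes. The sketch below assumes the two regions have already been aligned separately; the sequences are toy strings, not real olfactory receptor data, and a real analysis would use a substitution model rather than raw p-distances.

```python
def p_distance(seq1, seq2):
    """Uncorrected p-distance between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    compared = differing = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Toy pairwise alignments for one hypothetical gene pair.
cds_pair = ("ATGGCCAAGT", "ATGGCTAAGT")
utr_pair = ("TTAC-GATTA", "TAACGGCTTA")

cds_d = p_distance(*cds_pair)
utr_d = p_distance(*utr_pair)
print(f"CDS: {cds_d:.3f}  UTR: {utr_d:.3f}")
```

If the UTR distance is consistently larger across many gene pairs, that would support the idea that the UTRs diverge faster, keeping in mind that alignment quality in UTRs is itself uncertain.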

8.9 years ago
Asaf 10k

Introns do matter, or at least their location along the gene does, according to:

http://www.ncbi.nlm.nih.gov/pubmed/17495009
