Question: Where To Obtain A Species Tree For Phylogenetic Tree Reconciliation
8.3 years ago, miquelduranfrigola wrote:

Hi all,

I do not work in phylogenetics, but I need some advice for a particular task in my research. Apologies in advance if my question is naive.

Briefly, I downloaded orthologs of a protein sequence from OMA Browser, and I need to obtain a phylogenetic tree of the fetched orthologs. So far I have built a multiple sequence alignment (MSA), curated it and, finally, built a gene tree using PhyML 3.0.

Still, I'd like to perform a final gene/species tree reconciliation step. I know that many algorithms exist for this purpose, but to be honest I do not know where to get the species tree from. In practice, I need a species tree that, potentially, includes all species in OMA Browser. In other words, where can I find a comprehensive species tree? Which one would you trust the most?

Thanks a lot!

phylogeny • 3.1k views

@miquel I am not really familiar with OMA, but it appears that they have a "representative" sample of the entire tree of life. If you are looking for a "species tree of life", I do not think you'll be able to find that. For that matter, I do not think such a tree exists. The gene trees are confusing enough as it is.

Written 8.3 years ago by Whetting
8.3 years ago, DG wrote:

It really depends on the question you are trying to ask. OMA contains orthologs from Bacteria, Archaea, and Eukaryotes. No GOOD reference tree exists for all three groups together, and even for just Eukaryotes there are still major parts of the tree that aren't very well resolved. A gene tree can differ from a reference species tree for any number of true biological reasons (different gene histories, lateral gene transfer, etc.) as well as for artifactual ones. For more limited groups you may be able to get fairly good reference trees by searching the literature (Mammals, Vertebrates, Metazoa for instance). For other questions you may want to compare to results from multi-gene concatenated phylogenies.

Generally I'm not in favour of trying to force/reconcile a gene tree with a species tree. It is forcing the data into a preconceived idea of what it should look like. Maybe if you can give some more detail about the question you are asking of the data, we can offer further suggestions.

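To make concrete what a reconciliation step does with a gene tree that disagrees with the species tree, here is a minimal sketch of the classic LCA mapping, in which a gene-tree node is called a duplication when it maps to the same species-tree node as one of its children. The nested-tuple trees, the `species_copy` gene naming convention, and all function names are assumptions made up for this illustration; actual reconciliation tools differ in detail.

```python
def clades(tree):
    """Return the leaf set of every node in a nested-tuple tree (own clade last)."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    result, leaves = [], frozenset()
    for child in tree:
        sub = clades(child)
        result.extend(sub)
        leaves = leaves | sub[-1]  # the child's own clade is last in its list
    result.append(leaves)
    return result


def lca(species_clades, species_set):
    """Smallest species-tree clade that contains every species in species_set."""
    return min((c for c in species_clades if species_set <= c), key=len)


def count_duplications(gene_tree, species_clades, species_of):
    """Return (species clade this gene node maps to, duplications at or below it)."""
    if isinstance(gene_tree, str):
        return frozenset([species_of(gene_tree)]), 0
    mapped_children, dups = [], 0
    for child in gene_tree:
        mapped, d = count_duplications(child, species_clades, species_of)
        mapped_children.append(mapped)
        dups += d
    here = lca(species_clades, frozenset().union(*mapped_children))
    if any(mapped == here for mapped in mapped_children):
        dups += 1  # node maps to the same place as a child: inferred duplication
    return here, dups


# Hypothetical toy data: gene names are "<species>_<copy>".
species_tree = (("human", "mouse"), "chicken")
species_clades = clades(species_tree)
species_of = lambda gene: gene.split("_")[0]

concordant = (("human_a", "mouse_a"), "chicken_a")
discordant = (("human_a", "chicken_a"), "mouse_a")

print(count_duplications(concordant, species_clades, species_of)[1])  # 0
print(count_duplications(discordant, species_clades, species_of)[1])  # 1
```

Note that the discordant topology is explained as a duplication (plus implied losses) whether the conflict actually comes from a duplication, lateral transfer, or plain estimation error, which is the concern raised above.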

Hi Dan, thanks for your helpful response. In the end, I want to do coevolution analysis of pairs of residues in a protein sequence. The method I'm using (CoMap) requires an accurate phylogenetic tree. I just thought that reconciling the gene tree (from the MSA) with the species tree would give me a more accurate topology and distances. Is this right? If not, what would you do?

Written 8.3 years ago by miquelduranfrigola

I would stick to my gene tree. I would argue that you are not going to get a more accurate tree (for that protein) any other way. You could (and I would) build the same tree using different algorithms and confirm that the trees are congruent. At that point, your tree is as "accurate" as it is likely to get!

Written 8.3 years ago by Whetting
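One common way to put a number on "congruent" is the Robinson-Foulds (RF) distance, which counts the bipartitions (splits) found in one tree but not the other; 0 means identical unrooted topologies. The following is a dependency-free sketch under simplifying assumptions: unquoted leaf names without spaces, both trees on the same set of taxa, and made-up example trees. The function names are hypothetical, and a maintained phylogenetics library would be the safer choice for real data.

```python
import re


def parse_newick(text):
    """Parse Newick into nested lists of leaf names (lengths and node labels dropped)."""
    tokens = re.findall(r"[(),;]|[^(),;]+", re.sub(r"\s+", "", text))
    stack, current, prev = [], [], None
    for tok in tokens:
        if tok == "(":
            stack.append(current)
            current = []
        elif tok == ")":
            node = current
            current = stack.pop()
            current.append(node)
        elif tok not in ",;" and prev != ")":
            current.append(tok.split(":")[0])  # leaf name without branch length
        # a label right after ")" is a support value or internal name: ignored
        prev = tok
    return current[0]


def clade_sets(node, out):
    """Append the leaf set below every internal node to out; return this node's set."""
    if isinstance(node, str):
        return frozenset([node])
    below = frozenset().union(*(clade_sets(child, out) for child in node))
    out.append(below)
    return below


def splits(tree):
    """Return (all leaves, set of non-trivial bipartitions) for an unrooted tree."""
    internal = []
    leaves = clade_sets(tree, internal)
    ref = min(leaves)  # always store the side of the split that lacks this taxon
    found = set()
    for below in internal:
        side = leaves - below if ref in below else below
        if 2 <= len(side) <= len(leaves) - 2:
            found.add(side)
    return leaves, found


def rf_distance(newick_a, newick_b):
    """Number of splits present in exactly one of the two trees."""
    leaves_a, splits_a = splits(parse_newick(newick_a))
    leaves_b, splits_b = splits(parse_newick(newick_b))
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same taxa")
    return len(splits_a ^ splits_b)


# Made-up trees: same topology with a different rooting, then a different topology.
tree_1 = "((A:0.1,B:0.2)0.98:0.05,(C:0.1,D:0.3)0.75:0.05,E:0.4);"
tree_2 = "(((A,B),E),(C,D));"
tree_3 = "((A:0.1,C:0.2)1.00:0.05,(B:0.1,D:0.3)0.60:0.05,E:0.4);"

print(rf_distance(tree_1, tree_2))  # 0
print(rf_distance(tree_1, tree_3))  # 4
```

RF only looks at topology, so two trees with distance 0 can still differ in branch lengths and support values.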

Exactly, and especially if you are doing co-evolution analysis and require an accurate phylogenetic tree, then the gene tree is what you want. Make sure you pick appropriate models and good methods in order to have the best topology and branch length estimates. Because the gene tree can legitimately differ from the species tree, correcting the topology based on the species tree actually makes your phylogeny less accurate, not more. If you have to do lots of trees, FastTree 2 tends to give pretty decent estimates. Otherwise I recommend RAxML. As for substitution models, something like LG tends to give the best all-around performance. There are (slower) ways of getting even more accurate trees, but their implementations are highly specialized and specific to each protein.

Written 8.3 years ago by DG
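For reference, typical invocations of the two programs with the LG model look roughly like the commands assembled below. This is a sketch, not a recommended pipeline: the file names, run name, seed, and bootstrap count are placeholders, the executable names vary by installation (for example, RAxML is often installed as a threaded variant with a different name), and the flags should be checked against the manual for the version you have.

```python
import shlex


def fasttree_cmd(alignment):
    """FastTree 2 on a protein alignment: LG model, Gamma-rescaled branch lengths.

    FastTree writes the tree to standard output, so redirect it to a file.
    """
    return ["FastTree", "-lg", "-gamma", alignment]


def raxml_cmd(alignment, run_name, seed=12345, bootstraps=100):
    """RAxML rapid bootstrap plus best-scoring ML tree search under LG + Gamma."""
    return [
        "raxmlHPC",
        "-f", "a",              # rapid bootstrapping and ML search in one run
        "-m", "PROTGAMMALG",    # protein data, Gamma rate heterogeneity, LG matrix
        "-p", str(seed),        # seed for the parsimony starting trees
        "-x", str(seed),        # seed for rapid bootstrapping
        "-#", str(bootstraps),  # number of bootstrap replicates
        "-s", alignment,
        "-n", run_name,
    ]


# Placeholder file names; print the commands instead of running them.
print(shlex.join(fasttree_cmd("orthologs.fasta")), "> orthologs.fasttree.nwk")
print(shlex.join(raxml_cmd("orthologs.phy", "orthologs_lg")))
```

To actually run a command, the argument list can be passed to `subprocess.run`, with FastTree's standard output redirected to the tree file.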

Hi Dan, this is clarifying. Thanks! I've implemented a pipeline using MrBayes, but I'll try RAxML as well. Thanks again.

Written 8.3 years ago by miquelduranfrigola

No problem. It is often a good idea to build both an ML and a Bayesian tree anyway, at least to get support values on nodes and to make sure that both recover approximately the same tree topology.

Written 8.3 years ago by DG

Got it! Thanks a lot.

Written 8.3 years ago by miquelduranfrigola