Where To Obtain A Species Tree For Phylogenetic Tree Reconciliation
11.7 years ago

Hi all,

I am not into the field of phylogeny, but I need some advice for a particular task in my research. My excuses beforehand if my question is naive.

Briefly, I downloaded orthologs of a protein sequence from OMA Browser, and I need to obtain a phylogenetic tree of the fetched orthologs. What I have done so far is a multiple sequence alignment (MSA), followed by some MSA curation and, finally, I have built a gene tree using PhyML 3.0.

Still, I'd like to perform a final gene/species tree reconciliation step. I know that many algorithms exist for this purpose but, to be honest, I do not know where to get the species tree from. In practice, I need a species tree that could include all species in OMA Browser. In other words, where can I find a comprehensive species tree? Which one would you trust the most?

Thanks a lot!

phylogeny • 4.0k views

@miquel I am not really familiar with OMA, but it appears that they have a "representative" sample of the entire tree of life. If you are looking for a "species tree of life", I do not think you'll be able to find one. For that matter, I do not think such a tree exists. The gene trees are confusing enough as it is.

11.7 years ago
DG 7.3k

It really depends on the question you are trying to ask. OMA contains orthologs from Bacteria, Archaea, and Eukaryotes. No GOOD reference tree exists for all three groups together, and even for Eukaryotes alone there are still major parts of the tree that aren't very well resolved. A gene tree can differ from a reference species tree for any number of true biological reasons (different gene histories, lateral gene transfer, etc.) as well as for artifactual ones. For more limited groups (Mammals, Vertebrates, Metazoa, for instance) you may be able to get fairly good reference trees by searching the literature. For other questions you may want to compare against results from multi-gene concatenated phylogenies.

Generally I'm not in favour of trying to force/reconcile a gene tree with a species tree. It forces the data into a preconceived idea of what it should look like. If you can give some more detail about the question you are asking of the data, maybe we can offer further suggestions.


Hi Dan, thanks for your helpful response. In the end, I want to do coevolution analysis of pairs of residues in a protein sequence. The method I'm using (CoMap) requires an accurate phylogenetic tree. I just thought that reconciling the gene tree (from the MSA) with the species tree would give me a more accurate topology and more accurate distances. Is this right? If not, what would you do?


I would stick to my gene tree. I would argue that you are not going to get a more accurate tree (for that protein) any other way. You could (and I would) build the same tree using different algorithms and confirm that the trees are congruent. At that point, your tree is as "accurate" as it is likely to get!
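
One way to make "congruent" concrete is to count the splits (bipartitions) that two trees disagree on, i.e. the unweighted Robinson-Foulds distance. Below is a minimal, dependency-free Python sketch, not a definitive implementation. The function names (`leaf_sets`, `splits`, `robinson_foulds`) and the toy Newick strings are made up for illustration. It assumes plain Newick with unquoted labels and the same set of sequences in both trees, and it ignores branch lengths and support values. For real work an established library such as DendroPy or ETE is probably the safer choice.

```python
import re


def leaf_sets(newick):
    """Parse a Newick string; return (all leaf names, list of clade leaf sets)."""
    tokens = re.findall(r"[(),;]|[^(),;]+", newick.strip())
    stack = [set()]   # one set of leaf names per currently open clade
    clades = []
    prev = None
    for tok in tokens:
        tok = tok.strip()
        if not tok:
            continue  # skip whitespace between symbols
        if tok == "(":
            stack.append(set())
        elif tok == ")":
            clade = stack.pop()
            clades.append(frozenset(clade))
            stack[-1] |= clade
        elif tok in ",;":
            pass
        elif prev != ")":
            # Leaf label, optionally followed by ":branch_length".
            name = tok.split(":")[0].strip()
            if name:
                stack[-1].add(name)
        # A label right after ")" is a support value or length: ignored.
        prev = tok
    return frozenset(stack[0]), clades


def splits(newick):
    """Return (leaves, set of non-trivial splits) for the unrooted tree."""
    leaves, clades = leaf_sets(newick)
    found = set()
    for clade in clades:
        other = leaves - clade
        if len(clade) >= 2 and len(other) >= 2:
            # Store both sides so the result does not depend on the rooting.
            found.add(frozenset((clade, other)))
    return leaves, found


def robinson_foulds(newick_a, newick_b):
    """Number of splits present in exactly one of the two trees."""
    leaves_a, splits_a = splits(newick_a)
    leaves_b, splits_b = splits(newick_b)
    if leaves_a != leaves_b:
        raise ValueError("both trees must contain the same sequences")
    return len(splits_a ^ splits_b)


# Toy trees (hypothetical): same topology written with different rootings,
# plus one tree with a genuinely different topology.
ml_tree = "((A:0.1,B:0.2)0.95:0.3,(C:0.1,D:0.1):0.2,E:0.5);"
bayes_tree = "(((A,B),E),(C,D));"
other_tree = "((A,C),(B,D),E);"

print(robinson_foulds(ml_tree, bayes_tree))  # same topology
print(robinson_foulds(ml_tree, other_tree))  # conflicting topology
```

A distance of 0 means the two methods recovered the same unrooted topology. Anything above 0 tells you how many splits to look at more closely, ideally together with their support values.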


Exactly. Especially if you are doing co-evolution analysis and require an accurate phylogenetic tree, then the gene tree is what you want. Make sure you pick appropriate models and good methods in order to get the best topology and branch length estimates. Because the gene tree can legitimately differ from the species tree, correcting the topology based on the species tree actually makes your phylogeny less accurate, not more. If you have to build lots of trees, FastTree 2 tends to give pretty decent estimates. Otherwise I recommend RAxML. As an all-round substitution model, something like LG tends to perform best. There are (slower) ways of getting even more accurate trees, but their implementations are highly specialized and specific to each protein.
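
As a rough illustration of the two options, here is a small Python sketch that only assembles the command lines for a FastTree 2 run and a RAxML run under LG. It does not execute anything. The helper names, file names, run name, and seed are placeholders, and the executable names (`FastTree`, `raxmlHPC`) are assumptions that differ between installations, so check the flags against the help output of the versions you actually have.

```python
import shlex


def fasttree_cmd(alignment):
    """argv for FastTree 2 on a protein alignment: LG model, gamma rescaling.

    FastTree writes the tree to standard output, so redirect it to a file
    when you actually run the command.
    """
    return ["FastTree", "-lg", "-gamma", alignment]


def raxml_cmd(alignment, run_name, seed=12345):
    """argv for a single RAxML ML search with LG plus gamma (PROTGAMMALG).

    The executable name is an assumption; builds are often installed as
    raxmlHPC-PTHREADS, raxmlHPC-SSE3, and so on.
    """
    return [
        "raxmlHPC",
        "-s", alignment,      # input alignment
        "-n", run_name,       # suffix used for the output files
        "-m", "PROTGAMMALG",  # protein data, gamma rate heterogeneity, LG matrix
        "-p", str(seed),      # random seed for the starting parsimony tree
    ]


# Hypothetical file names, for illustration only.
print(shlex.join(fasttree_cmd("orthologs.curated.fasta")))
print(shlex.join(raxml_cmd("orthologs.curated.phy", "orthologs_lg")))
```

Keeping the commands as argument lists makes it easy to pass them to `subprocess.run` later and to loop over many alignments, which is the situation where FastTree 2 is most attractive.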


Hi Dan, this is clarifying. Thanks! I've implemented a pipeline using MrBayes, but I'll try RAxML as well. Thanks again.


No problem. It is often a good idea to build both an ML and a Bayesian tree anyway, at least to get support values on the nodes and to make sure that both methods recover approximately the same topology.


Got it! Thanks a lot.
