Question: Phylogenetic distance between species
3.8 years ago by
Jautis280
United States
Jautis280 wrote:

Hi, I have a phylogenetic tree derived from genotype information from multiple individuals per species. It looks something like this: 

(((Spec1:1.77,(Spec1:31,Spec1:31):1.77):0,(Spec2:4.17,(Spec2:3.14,(Spec2:27.24,Spec2:27.24):3.14):4.17):0):0,(Spec3:1.8,(Spec3:0.4,(Spec3:0.7,(Spec3:18.3,Spec3:18.3):0.7):0.4):1.8):0);

 

How would I simplify this tree to capture only the distances between species? I.e., produce a tree like ((Spec1:x1,Spec2:x2),Spec3:x3)?

Is there an efficient way to do this for a large tree?

trees newick phylogenetics • 1.1k views
3.8 years ago by
jhc2.8k
Germany
jhc2.8k wrote:

Your tree resembles a regular gene tree with duplications. However, it's not clear to me if: 1) the duplicated items are always like in your example (all branches from the same species are grouped together) or 2) you could also have complex patterns like ((spAseq1, spBseq1), (spAseq2, spBseq2)). 

If 1), you just need to collapse the species-specific nodes into a single branch, choosing a method for summarising the distances therein (e.g. max branch length, average, sum, etc.). You could easily do this programmatically using any phyloinformatics toolkit. I use ETE, but it would also be possible with Biopython (Bio.Phylo), BioPerl or Bio::Phylo, etc.
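
To make case 1) concrete, here is a minimal plain-Python sketch rather than ETE or Biopython code. It assumes every leaf is named after its species and that each species forms a single clade. The helper names (parse_newick, collapse, to_newick) and the toy tree are hypothetical. Setting the collapsed branch to the stem length plus the mean root-to-tip depth inside the clade is only one of the possible summaries; pass summarise=max for another.

```python
import re
from statistics import mean

PUNCT = set("(),:;")

def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, children."""
    tokens = re.findall(r"[(),:;]|[^(),:;\s]+", text)
    pos = 0

    def node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1                      # consume "("
            children.append(node())
            while tokens[pos] == ",":
                pos += 1                  # consume ","
                children.append(node())
            pos += 1                      # consume ")"
        name, length = "", 0.0
        if tokens[pos] not in PUNCT:      # optional node label
            name = tokens[pos]
            pos += 1
        if tokens[pos] == ":":            # optional branch length
            length = float(tokens[pos + 1])
            pos += 2
        return {"name": name, "length": length, "children": children}

    return node()

def species_below(n):
    """Set of species (leaf names) found under node n."""
    if not n["children"]:
        return {n["name"]}
    return set().union(*(species_below(c) for c in n["children"]))

def tip_depths(n):
    """Distance from node n down to each leaf below it."""
    if not n["children"]:
        return [0.0]
    return [c["length"] + d for c in n["children"] for d in tip_depths(c)]

def collapse(n, summarise=mean):
    """Replace every single-species clade by one leaf.

    Assumption: new length = stem length + summarise(depths inside the clade).
    """
    species = species_below(n)
    if len(species) == 1:
        length = n["length"] + summarise(tip_depths(n))
        return {"name": species.pop(), "length": length, "children": []}
    kids = [collapse(c, summarise) for c in n["children"]]
    return {"name": n["name"], "length": n["length"], "children": kids}

def to_newick(n):
    """Serialise back to Newick (without the trailing semicolon)."""
    label = f"{n['name']}:{n['length']:g}"
    if n["children"]:
        return "(" + ",".join(to_newick(c) for c in n["children"]) + ")" + label
    return label

# Hypothetical toy tree: three individuals of species A, two of species B.
toy = "((A:1,(A:0.5,A:0.5):0.5):2,(B:4,B:2):1);"
print(to_newick(collapse(parse_newick(toy))) + ";")   # (A:3,B:4):0;
```

Both helpers recompute the leaf sets at every node, which is fine for a sketch but worth caching on a very large tree.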

If 2), you would need to decompose your gene tree into all possible species subtrees. The TreeKO methodology is good for this, and I recently implemented it in ETE so it can also be used programmatically. In brief, you will need to decompose your tree into multiple subtrees using the tree.get_speciation_trees() function. Then you need to build a consensus from the resulting subtrees. For the consensus, you could just compute a distance matrix averaging the all-against-all distances observed among the species nodes, or build a consensus tree (check Biopython for this).
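
The averaging step on its own can be sketched in plain Python. This is not TreeKO and does not call ETE; it only shows the all-against-all averaging on one tree, and with decomposed subtrees you would pool the distances from each of them in the same way. The nested-tuple tree layout, the function names and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy tree. A leaf is (species_name, branch_length);
# an internal node is ([children], branch_length).
tree = ([
    ([
        ([("A", 1.0), ("A", 3.0)], 2.0),   # two individuals of species A
        ([("B", 4.0), ("B", 2.0)], 1.0),   # two individuals of species B
    ], 0.5),
    ("C", 5.0),                            # one individual of species C
], 0.0)

def leaf_paths(node, key=(), edges=()):
    """Yield (species, edges) per leaf; edges are (edge_key, length) from the root."""
    content, length = node
    edges = edges + ((key, length),)
    if isinstance(content, str):           # leaf
        yield content, edges
    else:                                  # internal node
        for i, child in enumerate(content):
            yield from leaf_paths(child, key + (i,), edges)

def patristic(p, q):
    """Path length between two leaves: edges on exactly one of the two root paths."""
    return sum(length for _, length in set(p) ^ set(q))

def species_distances(tree):
    """Average the leaf-to-leaf distances for every pair of distinct species."""
    pooled = defaultdict(list)
    for (sp1, p), (sp2, q) in combinations(list(leaf_paths(tree)), 2):
        if sp1 != sp2:
            pooled[tuple(sorted((sp1, sp2)))].append(patristic(p, q))
    return {pair: sum(d) / len(d) for pair, d in pooled.items()}

print(species_distances(tree))
# {('A', 'B'): 8.0, ('A', 'C'): 9.5, ('B', 'C'): 9.5}
```

This compares every pair of leaves, so the cost grows quadratically with the number of individuals. The resulting matrix could then be fed to any distance-based tree builder if a species tree is needed.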

 

written 3.8 years ago by jhc2.8k

Hi, sorry for the delayed response. 

The duplicated items are individuals, not multiple genes from the species. So, for example, I have three individuals from species 1 that cluster together and three individuals from species 2 that cluster together, and what I want to know is the distance between species 1 and species 2. I think this would be straightforward if the branch lengths of individuals within a species were the same, but they are not.

Reply written 3.7 years ago by Jautis280 (modified 3.7 years ago)