Question: Statistical significance in phylogenies
4.0 years ago, ceruleanivy (30) wrote:

I have constructed a distance matrix to produce a phylogenetic tree for 10 species with the R package 'phangorn', and I would like to know how I can calculate p-values for significantly different species based purely on phylogenetic data. I would appreciate some insight, especially from anyone who has tried something similar with 'phytools'.

next-gen R genome • 1.2k views
modified 4.0 years ago by Manvendra Singh (2.1k) • written 4.0 years ago by ceruleanivy (30)

Can you clarify exactly what you mean by "calculate p-values for significantly different species"? Are you looking to calculate a P-value associated with the distance between two species, or are you looking for statistical support for species being clustered together in the tree? If the latter, then Manvendra's answer below is the correct one. Bootstrap values aren't p-values, but they are support values for a given internal node in the phylogenetic tree. If, however, you are looking for statistical support for two species being separated from one another, that's a different matter entirely: it would involve constructing one or more phylogenetic trees under an alternative hypothesis and running tests on those trees, for instance the Approximately Unbiased (AU) test.
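To make "support value for an internal node" concrete, here is a minimal Python sketch of how a bootstrap proportion is counted. It assumes the replicate trees have already been built and reduced to the sets of tips under each internal node; the names `clade_support` and `REPLICATES` and the toy data are made up for illustration and do not come from any package.

```python
def clade_support(replicate_trees, clade):
    """Fraction of bootstrap replicate trees that contain `clade`.

    Each replicate tree is represented (an assumption of this sketch)
    as a set of frozensets, one frozenset of tip names per internal node.
    """
    clade = frozenset(clade)
    hits = sum(1 for clades in replicate_trees if clade in clades)
    return hits / len(replicate_trees)


# Hypothetical toy data: four replicate trees over the tips A, B, C, D.
REPLICATES = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "D"})},
]

support_ab = clade_support(REPLICATES, {"A", "B"})
```

The result is a proportion of replicates recovering the grouping, which is why it is read as support for a node and not as a p-value for a pair of species.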

written 4.0 years ago by DG (7.2k)

I think you got it right. I would like some sort of metric that helps me quantify the relationship between two species in terms of statistical significance.

written 4.0 years ago by ceruleanivy (30)

Well, I offered two different things you might be trying to do, and they are very different things. Even with your response to Manvendra's answer, I'm still not clear on exactly what you want to do. It seems like you really want to calculate p-values for all species pairs in your tree. Keep in mind that a phylogenetic tree gives you a lot of information that is interdependent. In the simplest case you have two pieces of information: the topology and the branch lengths. The measure of relatedness typically used for any two species in a tree is simply the sum of the branch lengths between them, which gives you an evolutionary distance metric.
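As an illustration of that distance, here is a small Python sketch that sums branch lengths along the path between two tips (often called the patristic distance). The tree encoding, the branch lengths, and the function names are all hypothetical and chosen only for this example; in R, `cophenetic.phylo` in the 'ape' package computes the same kind of pairwise matrix from a tree object.

```python
from itertools import combinations

# Hypothetical rooted tree: child -> (parent, branch length to parent).
TREE = {
    "A": ("n1", 0.1),
    "B": ("n1", 0.2),
    "n1": ("root", 0.3),
    "C": ("root", 0.5),
}


def distances_to_ancestors(tree, node):
    """Map `node` and each of its ancestors to the cumulative branch length from `node`."""
    dist = {node: 0.0}
    total = 0.0
    while node in tree:
        parent, length = tree[node]
        total += length
        dist[parent] = total
        node = parent
    return dist


def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path between tips `a` and `b`."""
    up_a = distances_to_ancestors(tree, a)
    up_b = distances_to_ancestors(tree, b)
    shared = set(up_a) & set(up_b)
    # With non-negative branch lengths, the most recent common ancestor
    # is the shared node that minimises the summed path length.
    return min(up_a[n] + up_b[n] for n in shared)


def patristic_matrix(tree, tips):
    """Symmetric nested dict of pairwise distances between the given tips."""
    matrix = {t: {t: 0.0} for t in tips}
    for a, b in combinations(tips, 2):
        d = patristic_distance(tree, a, b)
        matrix[a][b] = matrix[b][a] = d
    return matrix


matrix = patristic_matrix(TREE, ["A", "B", "C"])
```

Note that this gives a matrix of distances, not of p-values; turning a distance into a p-value still requires a null hypothesis to test against.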

I think we still need more detail about the question you are asking and trying to answer with these P-values. P-values fall out of specific tests done to answer specific questions. You need a clear idea of the null hypothesis you are testing against.

written 4.0 years ago by DG (7.2k)
4.0 years ago, Manvendra Singh (2.1k), Berlin, Germany, wrote:

You can try bootstrapping. You choose the number of bootstrap replicates, and then you can see how many of the resamplings give a tree similar to the one you expect. From that you can calculate p-values too.

There is a nice R package that does this.

It's called pvclust.
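For anyone unsure what one bootstrap resampling looks like, here is a rough Python sketch of a single replicate, assuming the starting data is a sequence alignment whose columns (sites) are drawn with replacement. The alignment, the species names, and `bootstrap_alignment` are hypothetical, and this is not pvclust's own code; each replicate would then go through the same distance and tree-building steps as the original data.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: sites sampled with replacement.

    `alignment` maps species name -> aligned sequence (all the same length).
    The same sampled column indices are applied to every species so that
    the columns stay intact.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment with three species and eight sites.
ALIGNMENT = {
    "sp1": "ACGTACGT",
    "sp2": "ACGTTCGT",
    "sp3": "ACCTACGA",
}

# A seeded generator keeps the replicate reproducible.
replicate = bootstrap_alignment(ALIGNMENT, random.Random(42))
```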

written 4.0 years ago by Manvendra Singh (2.1k)

Thanks. Do you know which function will give me the lowest possible p-value between two species? For example, a population of 10 would eventually lead to a 10x10 matrix of p-values.

written 4.0 years ago by ceruleanivy (30)