Question: Statistical significance in phylogenies
ceruleanivy (20) wrote:

I have constructed a distance matrix in order to build a phylogenetic tree for 10 species with the R package 'phangorn', and I would like to know how I can calculate p-values for significantly different species based purely on phylogenetic data. I would appreciate some insight, especially from those who have tried anything similar with 'phytools'.

Tags: next-gen, R, genome
written 5 weeks ago by ceruleanivy (20), modified 5 weeks ago by Manvendra Singh (1.9k)

Can you clarify exactly what you mean by "calculate p-values for significantly different species"? Are you looking to calculate a p-value associated with the distance between two species, or are you looking for statistical support for species being clustered together in the tree? If the latter, then Manvendra's answer below is the correct one. Bootstrap values aren't p-values, but they are support values for a given internal node in the phylogenetic tree. If, however, you are looking for statistical support for two species being separated from one another, that's a different matter entirely: it would involve constructing one or more phylogenetic trees under an alternative hypothesis and running tests on those trees, for instance the Approximately Unbiased (AU) test.

written 5 weeks ago by Dan Gaston (6.6k)

I think you got it right. I would like some sort of metric that helps me quantify the relationship between two species in terms of statistical significance.

written 5 weeks ago by ceruleanivy (20)

Well, I offered two different things you might be trying to do, and they are very different. Similarly, from your response to Manvendra's answer, I'm still not clear on exactly what you want to do. It seems like you really want to calculate p-values for all species pairs in your tree. Keep in mind that a phylogenetic tree gives you a lot of interdependent information. In the simplest case, you have two pieces of information: the topology and the branch lengths. The measure of relatedness typically used for any two species in a tree is simply the sum of the branch lengths on the path between them, which gives you an evolutionary distance metric.
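To make the "sum of branch lengths" idea concrete, here is a minimal sketch in plain Python. The tree, the node names, and the helper functions are all made up for illustration and have nothing to do with the phangorn or phytools APIs. The tree is stored as a child-to-parent map, and the distance between two species is the total branch length on the path through their most recent common ancestor.

```python
from itertools import combinations

# Hypothetical rooted tree: child -> (parent, branch length to parent).
TREE = {
    "A": ("n1", 0.1),
    "B": ("n1", 0.2),
    "n1": ("root", 0.3),
    "C": ("root", 0.5),
}


def path_to_root(tree, node):
    """Map each ancestor of `node` (and `node` itself) to its distance from `node`."""
    dist = 0.0
    out = {node: 0.0}
    while node in tree:
        parent, length = tree[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path between a and b."""
    up_a = path_to_root(tree, a)
    up_b = path_to_root(tree, b)
    # With non-negative branch lengths, the shared ancestor that minimises
    # the summed distance is the most recent common ancestor.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def distance_matrix(tree, leaves):
    """All pairwise distances as a dict keyed by (species, species)."""
    d = {(x, x): 0.0 for x in leaves}
    for a, b in combinations(leaves, 2):
        d[(a, b)] = d[(b, a)] = patristic_distance(tree, a, b)
    return d


if __name__ == "__main__":
    for pair, value in sorted(distance_matrix(TREE, ["A", "B", "C"]).items()):
        print(pair, round(value, 3))
```

For 10 species this gives a 10x10 matrix of distances. Note that these are distances, not p-values: turning them into p-values would still require a specific test and null hypothesis.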

I think we still need more detail about the question you are trying to answer with these p-values. P-values fall out of specific tests done to answer specific questions, so you need a good idea of what null hypothesis you are actually testing against.

written 4 weeks ago by Dan Gaston (6.6k)
Manvendra Singh (1.9k) wrote:

You can try bootstrapping. You choose the number of bootstrap replicates, and then you can see how many of the resampled datasets give a tree similar to the one you expect. From that, you can calculate p-values too.

There is a nice R package that does this, called pvclust.
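The resampling idea can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions, and it is not how pvclust or phangorn work internally: the toy alignment and all function names are invented, distances are plain Hamming distances, and "the two species are each other's unique nearest neighbour" stands in for "the two species form a pair in the tree". A real bootstrap rebuilds the whole tree for every replicate.

```python
import random

# Hypothetical toy alignment: species -> characters, all of equal length.
ALIGNMENT = {
    "A": "AAAAAAAAAA",
    "B": "AAAAAAAAAA",
    "C": "CCCCCCCCCC",
    "D": "CCCCCGGGGG",
}


def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(s, t))


def unique_nearest(aln, x):
    """Closest other species to x, or None if the closest distance is tied."""
    dists = sorted((hamming(aln[x], aln[y]), y) for y in aln if y != x)
    if len(dists) > 1 and dists[0][0] == dists[1][0]:
        return None
    return dists[0][1]


def bootstrap_support(aln, a, b, n_boot=200, seed=0):
    """Fraction of column-resampled replicates in which a and b pair up."""
    rng = random.Random(seed)
    n_sites = len(next(iter(aln.values())))
    hits = 0
    for _ in range(n_boot):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        rep = {sp: "".join(seq[i] for i in cols) for sp, seq in aln.items()}
        if unique_nearest(rep, a) == b and unique_nearest(rep, b) == a:
            hits += 1
    return hits / n_boot


if __name__ == "__main__":
    print("A-B support:", bootstrap_support(ALIGNMENT, "A", "B"))
    print("C-D support:", bootstrap_support(ALIGNMENT, "C", "D"))
```

The number returned is a support proportion for one grouping, in the spirit of a bootstrap value on an internal node. It is not a p-value for the two species being "significantly different".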

written 5 weeks ago by Manvendra Singh (1.9k)

Thanks. Do you know which function will give me the lowest possible p-value between two species? For example, a set of 10 species would eventually lead to a 10x10 matrix of p-values.

written 5 weeks ago by ceruleanivy (20)