Question: Need help with codeml
13 months ago, rprog008 wrote:

Dear member,

I am running several codeml models to detect patterns of evolution in a disease gene set. I am performing the same analysis separately on 6 and on 16 mammalian species. In both cases the results indicate that my gene sets are evolving under purifying selection. However, in the chi-square test the p-value for the 6-species analysis is significant (p-value = 0.007), while the one for the 16-species analysis is not (p-value = 0.8). Now I am a bit worried: is it okay to report the 6-species result and discard the 16-species data? Or should I report the 16-species result, stating that these genes are evolving under weak purifying selection?

Thanks in advance


If the p-value is non-significant, it means you cannot reject the null hypothesis - for codeml, I believe the null hypothesis is "the sequences are evolving neutrally". So you can't reject the hypothesis that the sequences are evolving neutrally, and you can't say these genes are "evolving under weak purifying selection".

Some interesting reads:

Still Not Significant

Misuse of ‘trend’ to describe ‘almost significant’ differences in anaesthesia research

written 13 months ago by h.mon

Thanks a lot for clearing my doubt and for sharing interesting articles. :)

written 13 months ago by rprog008
13 months ago, Brice Sarver (United States) wrote:

What p-value?

If you're doing the likelihood ratio test to distinguish between models with and without a class of sites with ω > 1, the test statistic is compared against a chi-square distribution. For most comparisons the degrees of freedom equal the difference in the number of free parameters between the two models; for the M8 vs. M8a comparison the null distribution is a mixture that involves a chi-square with one degree of freedom. Either way, the p-value is simply providing evidence for selecting one model over the other.
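To make the test concrete, here is a minimal sketch in plain Python of how such a p-value could be computed from two log-likelihoods. The function names and the log-likelihood values in the usage note are hypothetical, not taken from this thread or from codeml itself. The `mixture` option assumes the commonly used 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom for M8 vs. M8a, which amounts to halving the one-degree-of-freedom p-value.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points for the regularized upper incomplete gamma Q(a, z):
    # Q(1, z) = exp(-z) for even df, Q(1/2, z) = erfc(sqrt(z)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lrt_pvalue(lnl_null, lnl_alt, df, mixture=False):
    """Likelihood ratio test between two nested models.

    lnl_null, lnl_alt: log-likelihoods of the simpler and the richer model.
    df: difference in the number of free parameters.
    mixture: assumed 50:50 mixture of a point mass at 0 and chi-square(df),
             i.e. the chi-square p-value is halved (hypothetical helper flag).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p = chi2_sf(stat, df)
    if mixture:
        p = 0.5 * p if stat > 0 else 1.0
    return stat, p
```

With made-up log-likelihoods, `lrt_pvalue(-1234.56, -1230.12, df=2)` would correspond to an M7 vs. M8 style comparison (two extra parameters), and `lrt_pvalue(-100.0, -98.0, df=1, mixture=True)` to an M8a vs. M8 style comparison. A small p-value only says the richer model fits better; it does not by itself say which sites are under selection.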

If your best-fit model is one that supports some signature of positive selection (e.g., M8 as opposed to M7), you can then look at the results to see which codons show evidence for positive selection, as determined by the Naive Empirical Bayes and/or Bayes Empirical Bayes approaches.

Be aware that the one-rate models may not be the right tests for what you're actually trying to do in some cases. From the manual:

We suggest that The M0-M3 comparison should be used as a test of variable ω among sites rather than a test of positive selection. However, the model of a single ω for all sites is probably wrong in every functional protein, so there is little point of testing.

If you're explicitly looking for other flavors of selection, perhaps across loci, you may find it helpful to explore other approaches. I recommend the Datamonkey Adaptive Evolution Server.


Thanks, Brice, for the answer. I am facing one more problem. What if the p-value of the LRT for M1a vs. M2a is not significant, while the p-values of the LRTs for M0 vs. M3, M7 vs. M8, and M8 vs. M8a are significant? I am sorry if this seems a naive question. Please advise.

written 13 months ago by rprog008

Those sets of nested models are looking at slightly different things. The M7 vs. M8 and M8 vs. M8a comparisons are the best for identifying signatures of positive selection, with the M8 vs. M8a comparison (M8a fixes the extra site class at ω = 1) yielding fewer false positives.

Remember that you're selecting among models here; one model's likelihood or a comparison's p-value isn't saying which one is better for what you're trying to do. I'd suggest reading the manual or supporting papers and determining which are most suited for your purposes.

written 13 months ago by Brice Sarver