Question: Reconciling PAML results from random-sites models and clade model C.
Hello all, I am trying to make sense of some unusual results from my dataset. I am testing for positive selection in several genes across two clades of species that occupy different habitats (sunny and shady). Since the sunny species diverged later and evolved from shady species, I expected evidence of positive selection in the sunny species. When I run random-sites models (M2a vs M1a, M8 vs M7, and M8 vs M8a), I instead get strong evidence for positive selection in most genes for the species from the shady habitat (not what I expected, but that's fine). Note: I ran these as separate phylogenies, one with only the sunny species and one with only the shady species (two separate multiple sequence alignments).
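
For context, each of these comparisons is a likelihood ratio test between a null and an alternative model. Below is a minimal sketch of that calculation, assuming the lnL values are copied by hand from the codeml output. The function names and the numbers in the usage lines are placeholders for illustration, not my results, and the degrees of freedom (2 for M1a vs M2a and M7 vs M8, 1 for M8a vs M8) are the commonly used values.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution (df = 1 or 2 only)."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        # For df = 1 the chi-square tail equals erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # For df = 2 the chi-square tail is a plain exponential.
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def lrt(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a pair of nested models."""
    # Clamp at zero: small negative values can appear from optimisation noise.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf(stat, df)


# Placeholder log-likelihoods, NOT values from the dataset described here.
stat, p = lrt(lnl_null=-1000.0, lnl_alt=-997.0, df=2)  # e.g. M1a vs M2a
print(f"2*dlnL = {stat:.3f}, p = {p:.4f}")
```

The M8a vs M8 comparison is sometimes evaluated against a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom; using df = 1 as above is the more conservative choice.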

However, I also wanted to test for divergent selection using the whole phylogeny, so I ran the same genes on the combined dataset with the sunny species set as the foreground. The clade model C (CmC) analysis found significant divergent selection, but the shady species (the ones that showed positive selection under the random-sites models) were under strong purifying selection. The sunny species were also under purifying selection but were much less constrained: their wD values were much higher.

How do I make sense of this? Since the shady species showed positive selection for more genes under the random-sites models, I would have expected them to also have the higher wD values if divergent selection occurred, but CmC estimated them as being under stronger purifying selection. Is it because the combined phylogeny showed the whole tree to be under purifying selection, so that the positive selection in my background species somehow got hidden? It feels like one set of models says yes, there is positive selection, while the clade model says no, the clades are just divergent and one of them is under stronger purifying selection.

Any tips on how to interpret this would be much appreciated!

