Question

dN/dS in a tree (multiple genes)

7.9 years ago

BlastedBadger ▴ 160

Hi,

This is something of an emergency: I have to hand in my MSc internship report in one week, and there is an analysis that would be a very nice addition to it.

What I need

I am using the Ensembl database and want to know which branches in the gene trees are subject to selection. Since this is very computationally intensive (there are more than 22,000 gene trees in Ensembl), I did not want to run PAML (the codeml program) myself, but rather use existing results (Selectome). Unfortunately, the Ensembl version used in Selectome is old, and mapping each branch from the old version to the more recent one yields very few matches.

So that is why I am asking my question:

What I think I should do

My supervisor then suggested that I use the pairwise dN/dS values (already available in Ensembl) and cook up my own method to infer dN/dS on a per-branch basis:

As I understand it, I would need the dN and dS values themselves, not only the dN/dS ratios, which already seems like a complication.

Example with a tree of 3 genes: get the dN and dS between the 2 closest genes, then between the most distant gene and each of the 2 others. Then compute dN and dS on the branch leading to the 2 closest genes by subtraction.

This sounds odd to me. Is it even an approximate method? I think it's equivalent to calculating the dN/dS ratio in the inner branch by inferring the ancestral sequence by parsimony. How bad would that be compared to the likelihood computation performed by codeml during the branch-site analysis?
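To make the three-gene subtraction concrete, here is a minimal sketch. It assumes the pairwise dN and dS values are additive along the tree, which is only an approximation for real data. The function names and the numbers are hypothetical illustrations, not Ensembl or PAML output. Note that with only three genes and pairwise distances the tree is unrooted, so the "branch leading to the 2 closest genes" cannot be separated from the branch leading to the distant gene: the sketch returns their sum as the branch attached to C.

```python
def three_point_branch_lengths(d_ab, d_ac, d_bc):
    """Branch lengths of the unrooted 3-leaf tree joining A, B and C.

    Assumes the pairwise distances are additive, i.e. each pairwise
    distance is the sum of the two branches connecting the pair.
    Returns (branch to A, branch to B, branch to C).
    """
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


def branch_dnds(pairwise_dn, pairwise_ds):
    """Per-branch dN/dS from pairwise (AB, AC, BC) dN and dS values.

    A branch whose estimated dS is not positive gets NaN, because the
    ratio is undefined (or meaningless) there.
    """
    dn = three_point_branch_lengths(*pairwise_dn)
    ds = three_point_branch_lengths(*pairwise_ds)
    return [n / s if s > 0 else float("nan") for n, s in zip(dn, ds)]


# Made-up pairwise values in the order (A-B, A-C, B-C); A and B are the
# two closest genes, C is the most distant one.
example_dn = (0.02, 0.05, 0.05)
example_ds = (0.20, 0.50, 0.50)

dn_a, dn_b, dn_c = three_point_branch_lengths(*example_dn)
omegas = branch_dnds(example_dn, example_ds)
print(dn_a, dn_b, dn_c)
print(omegas)
```

When the pairwise estimates are noisy or saturated, the subtraction can give negative branch lengths, which is one visible symptom of the additivity assumption failing.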

Thanks a lot for your insights.

dN/dS dNdS KaKs Phylogenetics PAML • 3.2k views

ADD COMMENT • link 17 months ago by BlastedBadger ▴ 160

score 0 · Answer 1 · 2022-10-11

19 months ago

chparada ▴ 70

Hi there,

I have the same problem. Were you able to figure it out? I have a large phylogeny and PAML is choking on it!

Let me know if you got something.

Cheers

Camilo

ADD COMMENT • link 19 months ago by chparada ▴ 70

PAML choking? As in running forever? Which model are you using? The free-ratio model is indeed heavy. I don't know what your goal is, but it might be more realistic anyway to use an omega that is constant across branches but variable among sites. Alternatively, you might have better luck with HyPhy.

ADD REPLY • link 17 months ago by BlastedBadger ▴ 160