Question: The best way to use PAML to analyze genes found only in some species on my tree
5.9 years ago by
seb85il0 wrote:

Hi all,
I am new to PAML and want to use it to analyze a large number of genes found in a group of bacterial strains. As a first step, I want to obtain a single omega value per gene across the whole tree (model = 0; NSsites = 0). My problem is that my tree includes all investigated species, while some genes are found only in a subset of strains. As a result, codeml can't calculate a tree-wide omega, since it doesn't have sequence data for the whole tree. What would be the best approach to get around this problem:

1. Calculate pairwise dN/dS values for my alignment using yn00 and take their average instead of estimating a tree-wide dN/dS.
2. Let codeml build its own tree for each gene and then calculate omega. In this case, how much bias could this introduce when comparing genes, since the alignment for each gene would most likely produce a different tree?
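
To make option 1 concrete, here is a rough sketch of the averaging step. It assumes the pairwise dN and dS estimates have already been pulled out of the yn00 output into (dN, dS) tuples. The function name and the decision to skip pairs with dS = 0 are my own illustrative assumptions, not anything PAML provides.

```python
from statistics import mean


def mean_pairwise_omega(pairs):
    """Average dN/dS over sequence pairs.

    `pairs` is an iterable of (dN, dS) tuples, assumed to have been
    extracted from yn00 output beforehand (parsing is not shown here).
    Pairs with dS == 0 are skipped because the ratio is undefined;
    that filtering rule is an assumption, not a PAML convention.
    """
    ratios = [dn / ds for dn, ds in pairs if ds > 0]
    if not ratios:
        return None  # no pair had a usable dS
    return mean(ratios)


# Made-up numbers, purely to show the call.
example_pairs = [(0.02, 0.10), (0.03, 0.30), (0.01, 0.0)]
print(mean_pairwise_omega(example_pairs))
```

Note that a plain mean of ratios can be dominated by pairs with very small dS, so the filtering rule would probably need more thought for closely related strains.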

In addition, I have noticed that if I take a file of aligned sequences and just change their order (without re-aligning or anything else), I get different branch-specific omegas when running it through codeml with the same initial tree. Since codeml seems to use the tree as its guide and the tree didn't change, I don't understand why the order of the sequences in the alignment file matters. And if it does, what is the preferred order of the sequences?
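
For comparing runs, one thing that could be done in the meantime is to put every alignment into the same fixed order before it goes to codeml. Below is a minimal sketch of that, assuming the alignment has already been read into (name, sequence) tuples. The helper name is hypothetical, and this only standardizes the order; it does not explain why the order changes the estimates.

```python
def reorder_alignment(records, name_order):
    """Return aligned records rearranged to follow `name_order`.

    `records` is a list of (name, sequence) tuples; `name_order` is the
    desired list of sequence names. The sequences themselves are left
    untouched, so the alignment columns stay exactly as they were.
    """
    by_name = dict(records)
    if len(by_name) != len(records):
        raise ValueError("duplicate sequence names in alignment")
    if set(by_name) != set(name_order) or len(name_order) != len(by_name):
        raise ValueError("name_order does not match the alignment names")
    return [(name, by_name[name]) for name in name_order]


# Toy example with made-up strain names.
aln = [("strainB", "ATGAAA"), ("strainA", "ATGAAG")]
print(reorder_alignment(aln, ["strainA", "strainB"]))
```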

Thank you all for your help,

codeml paml alignment dn/ds • 2.3k views