The best way to use PAML to analyze genes found only in some species on my tree
9.0 years ago
seb85il • 0

Hi all,

I am new to PAML and want to use it to analyze a large number of genes found in a group of bacterial strains. As a first step, I wanted to obtain just one omega value per gene per tree (model = 0; NSsites = 0). My problem is that my tree includes all investigated species, while some genes are found only in a subset of strains. As a result, CodeML can't calculate a tree-wide omega, since it doesn't have sequence information for the whole tree. What would be the best approach to work around this problem:

  1. Calculate pairwise dN/dS values for my alignment using yn00 and take their average instead of estimating a tree-wide dN/dS.
  2. Allow CodeML to build its own tree for each gene and then calculate omega. In this case, how much bias could this introduce when comparing between genes, since the alignment for each gene would most probably produce a different tree?
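
To make option 1 concrete, here is a minimal sketch of the averaging step. It assumes the pairwise dN and dS estimates have already been extracted from the yn00 output into a list of (dN, dS) tuples. The function name `mean_pairwise_omega`, the `min_ds` threshold, and the example numbers are all hypothetical and only there for illustration. Pairs with dS at or near zero are skipped, because their ratio is undefined or unstable.

```python
from statistics import mean


def mean_pairwise_omega(pairs, min_ds=1e-6):
    """Average dN/dS over sequence pairs.

    pairs  : iterable of (dN, dS) tuples, one per pair of strains
    min_ds : pairs with dS at or below this value are skipped (assumed threshold)
    """
    omegas = [dn / ds for dn, ds in pairs if ds > min_ds]
    if not omegas:
        raise ValueError("no pair has a usable dS value")
    return mean(omegas)


# Made-up (dN, dS) values for three strain pairs; the last pair is skipped.
example = [(0.01, 0.10), (0.02, 0.10), (0.03, 0.00)]
print(mean_pairwise_omega(example))
```

Note that this is a mean of ratios; dividing the mean dN by the mean dS would give a different number, so whichever definition is used should be applied to every gene.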

In addition, I have noticed that if I take a file of aligned sequences and only change their order (without re-aligning or any other change), I get different branch-specific omegas when running it through CodeML with the same initial tree. Since CodeML uses the tree as its guide and the tree didn't change, I don't understand why the order of the sequences in the alignment file matters. And if it does matter, what is the preferred order of the sequences?
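
Before comparing the two runs, it may help to confirm that the two input files really differ only in sequence order. The sketch below assumes the alignments are stored as FASTA text; a PHYLIP-style file would need a different parser. The helper names `read_fasta` and `same_alignment_ignoring_order` are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a {name: sequence} dict."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line without a sequence name")
            name = fields[0]  # keep only the first word of the header
            if name in records:
                raise ValueError(f"duplicate sequence name: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data before the first header")
        else:
            records[name] += line
    return records


def same_alignment_ignoring_order(text_a, text_b):
    """True if both files hold the same name-to-sequence mapping."""
    return read_fasta(text_a) == read_fasta(text_b)


# Toy alignments with made-up names and sequences.
original = ">s1\nATGAAA\n>s2\nATGAAG\n"
reordered = ">s2\nATGAAG\n>s1\nATGAAA\n"
print(same_alignment_ignoring_order(original, reordered))
```

If this check passes for the real files, the difference in omegas is not caused by an accidental change to the sequences themselves.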

Thank you all for your help,

Evgeni

dN-dS CodeML PAML alignment • 2.8k views