Bio.Phylo.Paml.Codeml'S Results Parser Quietly Fails To Read All The Data
1
7
Entering edit mode
10.3 years ago
a1ultima ▴ 840

Biopython comes with methods to interface with the PAML package for phylogenetic analysis.

In particular I am using Bio.Phylo.PAML to run analyses using PAML's codeml.exe program which in my case does Ka/Ks (dN/dS) ratio analysis on pairs of orthologous gene sequences.

After running the analysis using results = cml.run() I can see it has successfully generated result.out files that look about right. Most importantly is the final line of the file that I need to parse into Python:

t= 0.2173  S=   703.9  N=  1489.1  dN/dS=  0.2247  dN = 0.0344  dS = 0.1529

What I am most interested in is dN/dS = 0.2247

According to Biopython's PAML wiki this value can be obtained from Python by doing results = cml.run() which generates a dictionary with a set of values I am interested in after running the analysis. The wiki claims I can find the values I need in a key called 'parameters'. But this only returns one of the values I need t= 0.2173, look:

>>> results.values()
['Fcodon', 'One dN/dS ratio for branches, ', '4.7b', {0: {'description': 'one-ratio', 'parameters': {'t': 0.1982}}}, {'htlv': {}, 'stlv': {}}]

Notice, that my parameters key only contains the t= 0.2173 and has omitted S= 703.9 N= 1489.1 dN/dS= 0.2247 dN = 0.0344 dS = 0.1529

Could anybody with codeml experience explain to me why the parser fails to yield most of the parameters (values) I am interested in?

Thank you

biopython python codeml parser read • 4.7k views
ADD COMMENT
0
Entering edit mode

Been hacking away at this problem for a while now, what is quite mysterious is that the PAML manual is given as version 4.7a (2013 May) whilst the codeml program I am using states it is PAML 4.7b (2013 Sep)... hmm perhaps this could be a very new bug caused by Biopython not yet being updated to deal with the new codeml version?

ADD REPLY
1
Entering edit mode

Could be a bug - why don't you report this to Biopython with a sample file and snippet of code to show the problem? https://github.com/biopython/biopython/issues

ADD REPLY
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Did you report a bug? Could you post a link to it?

ADD REPLY
0
Entering edit mode

Yep here (sorry for the late reply):

https://github.com/biopython/biopython/issues/483

ADD REPLY
5
Entering edit mode
10.0 years ago
AW ▴ 350

Hi,

You can obtain omega values for each branch using the following commands:

from Bio.Phylo.PAML import codeml

results = codeml.read(paml_outputfile)

print results["NSsites"][0]["parameters"]["omega"]

This gives you a list of omega (dn/ds) for each branch

I am using version 4.7b and biopython-1.6.3

Hope that helps!

A

ADD COMMENT

Login before adding your answer.

Traffic: 1498 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6