Question: Bio.Phylo.Paml.Codeml'S Results Parser Quietly Fails To Read All The Data
5
gravatar for a1ultima
5.3 years ago by
a1ultima710
London
a1ultima710 wrote:

Biopython comes with methods to interface with the PAML package for phylogenetic analysis.

In particular I am using Bio.Phylo.PAML to run analyses using PAML's codeml.exe program which in my case does Ka/Ks (dN/dS) ratio analysis on pairs of orthologous gene sequences.

After running the analysis using results = cml.run() I can see it has successfully generated result.out files that look about right. Most importantly is the final line of the file that I need to parse into Python:

t= 0.2173  S=   703.9  N=  1489.1  dN/dS=  0.2247  dN = 0.0344  dS = 0.1529

What I am most interested in is dN/dS = 0.2247

According to Biopython's PAML wiki this value can be obtained from Python by doing results = cml.run() which generates a dictionary with a set of values I am interested in after running the analysis. The wiki claims I can find the values I need in a key called 'parameters'. But this only returns one of the values I need t= 0.2173, look:

>>> results.values()
['Fcodon', 'One dN/dS ratio for branches, ', '4.7b', {0: {'description': 'one-ratio', 'parameters': {'t': 0.1982}}}, {'htlv': {}, 'stlv': {}}]

Notice, that my parameters key only contains the t= 0.2173 and has omitted S= 703.9 N= 1489.1 dN/dS= 0.2247 dN = 0.0344 dS = 0.1529

Could anybody with codeml experience explain to me why the parser fails to yield most of the parameters (values) I am interested in?

Thank you

python parser biopython codeml read • 2.6k views
ADD COMMENTlink modified 4.8 years ago by Biostar ♦♦ 20 • written 5.3 years ago by a1ultima710

Been hacking away at this problem for a while now, what is quite mysterious is that the PAML manual is given as version 4.7a (2013 May) whilst the codeml program I am using states it is PAML 4.7b (2013 Sep)... hmm perhaps this could be a very new bug caused by Biopython not yet being updated to deal with the new codeml version?

ADD REPLYlink modified 5.3 years ago • written 5.3 years ago by a1ultima710
1

Could be a bug - why don't you report this to Biopython with a sample file and snippet of code to show the problem? https://github.com/biopython/biopython/issues

ADD REPLYlink written 5.3 years ago by Peter5.8k

Don, see issues being solved here: https://github.com/biopython/biopython/issues/483#issuecomment-78052449

ADD REPLYlink modified 2.1 years ago • written 5.3 years ago by a1ultima710

Did you report a bug? Could you post a link to it?

ADD REPLYlink written 5.2 years ago by Peter5.8k

Yep here (sorry for the late reply):

https://github.com/biopython/biopython/issues/483

ADD REPLYlink written 4.1 years ago by a1ultima710
5
gravatar for AW
5.0 years ago by
AW350
United Kingdom
AW350 wrote:

Hi,

You can obtain omega values for each branch using the following commands:

from Bio.Phylo.PAML import codeml

results = codeml.read(paml_outputfile)

print results["NSsites"][0]["parameters"]["omega"]

This gives you a list of omega (dn/ds) for each branch

I am using version 4.7b and biopython-1.6.3

Hope that helps!

A

ADD COMMENTlink written 5.0 years ago by AW350
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1578 users visited in the last hour