Question: dN/dS Ratio - PAML: How Is It Possible to Obtain Negative Values for dN or dS?
6.9 years ago, Francois Olivier Hébert wrote:


I have a set of 383 coding sequences in which there is a whole bunch of SNPs. I only have two species (2 populations representing incipient species). I used PAML to estimate the dn/ds ratio for each sequence. For some of the sequences, I get negative values either for dn, ds or dn/ds.

I know from the FAQs PDF file that comes with PAML's latest distribution that -1.000 means infinity. So if you have dn > 0 and ds = 0, dividing dn by ds gives -1.0000, i.e., infinity. But what about a result such as:

Nei & Gojobori 1986. dN/dS (dN, dS)
(Note: This matrix is not used in later ML. analysis.
Use runmode = -2 for ML pairwise comparison.)

allele2             -0.6057 (-1.0000 1.6509)

How is it possible to have an infinite value for dn OR ds? dn = -1.0000 or ds = -1.0000 seems really strange to me. How should I interpret these negative values? My guess is that it means there are no synonymous or non-synonymous sites in the sequence, so PAML divides 0 by 0 and returns -1.0000. Example: if there are 0 synonymous mutations and 0 synonymous sites, the ratio of synonymous mutations to synonymous sites is reported as infinity (-1.0000) because it is a division by 0.

Am I wrong? If not, should I simply replace the value -1.0000 with 0 each time dn or ds equals -1.0000?
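As a note on post-processing: in the line quoted above, -1.0000 / 1.6509 rounds to -0.6057, so the reported ratio looks like the flag value simply divided by dS rather than a real estimate. Below is a minimal sketch of reading such a line and marking flagged values as missing instead of replacing them with 0. The function name `parse_ng86_line` and the regular expression are hypothetical, and the line layout is assumed only from the single example shown in this thread, not from the PAML documentation.

```python
import math
import re

# Layout assumed from the quoted output: name, dN/dS, then "(dN dS)".
NG86_LINE = re.compile(
    r"^(\S+)\s+(-?\d+\.\d+)\s*\(\s*(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*\)"
)

def parse_ng86_line(line):
    """Return (name, omega, dn, ds), with flagged values turned into NaN."""
    match = NG86_LINE.match(line.strip())
    if match is None:
        raise ValueError("line does not look like 'name ratio (dN dS)'")
    name = match.group(1)
    dn = float(match.group(3))
    ds = float(match.group(4))
    # A distance cannot be negative, so a negative dN or dS is treated
    # here as a "not computable" flag, not as a number.
    if dn < 0:
        dn = math.nan
    if ds < 0:
        ds = math.nan
    # Recompute the ratio only when both parts are usable; the ratio
    # printed in the file is ignored because it may be built from a flag.
    if math.isnan(dn) or math.isnan(ds) or ds == 0:
        omega = math.nan
    else:
        omega = dn / ds
    return name, omega, dn, ds

if __name__ == "__main__":
    print(parse_ng86_line("allele2             -0.6057 (-1.0000 1.6509)"))
```

Keeping the flagged entries as NaN lets downstream summaries skip them explicitly, whereas a 0 would be counted as a genuine estimate.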

Thank you for any help !

paml selection • 4.9k views

That's a really strange error! I'm sorry, I have no clue how to solve your problem. Your explanation seems logically correct, but it would mean that your sequences have 0 non-synonymous sites, and how could that be possible?

Reply written 6.9 years ago by Martombo
6.9 years ago, Liam Thompson (Gothenburg, Sweden) wrote:

From my understanding of the program's output, which is limited, an infinite value means that there are one or more non-synonymous substitutions and no synonymous substitutions in the branch. Obviously, an infinite value makes the data difficult to analyse, so an LRT should ideally be performed (e.g. comparing the lnL of branch-site Model A against site Model 1, and of branch-site Model B against site Model 3). I hope this sheds some light.
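For what it's worth, the LRT arithmetic itself is short. This is only a sketch: the lnL values below are made-up placeholders, not results from this dataset, the function names are hypothetical, and the degrees of freedom must be taken from the specific model pair being compared. Only the closed-form chi-square tail probabilities for 1 and 2 degrees of freedom are implemented, to stay within the standard library.

```python
import math

def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)

def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 is handled in this sketch")

if __name__ == "__main__":
    # Placeholder lnL values, for illustration only.
    lnl_null, lnl_alt = -1000.0, -995.0
    stat = lrt_statistic(lnl_null, lnl_alt)
    print(stat, chi2_sf(stat, df=2))
```

The null model's lnL goes first and the more parameter-rich model's lnL second, so a better-fitting alternative gives a positive statistic.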
