Question: dN/dS Ratio - PAML: How Is It Possible To Obtain Negative Values For dN Or dS?
7.3 years ago, Francois Olivier Hébert (Quebec) wrote:

Hi,

I have a set of 383 coding sequences containing many SNPs. I only have two species (2 populations representing incipient species). I used PAML to estimate the dN/dS ratio for each sequence. For some of the sequences, I get negative values for dN, dS, or dN/dS.

I know from the FAQ PDF file that comes with PAML's latest distribution that -1.0000 means infinity. So if you have dN > 0 and dS = 0, dividing dN by dS gives -1.0000, i.e., infinity. But sometimes I get a result such as:

Nei & Gojobori 1986. dN/dS (dN, dS)
(Note: This matrix is not used in later ML. analysis.
Use runmode = -2 for ML pairwise comparison.)

allele1             
allele2             -0.6057 (-1.0000 1.6509)

How is it possible to have an infinite value for dN or dS on its own? dN = -1.0000 or dS = -1.0000 seems really strange to me. How should I interpret these negative values? My guess is that it means there are no synonymous or non-synonymous sites in the sequence, so PAML divides 0 by 0 and returns -1.0000. Example: if there are 0 synonymous mutations and 0 synonymous sites, the ratio of synonymous mutations to synonymous sites is reported as infinity (-1.0000) because it is a division by 0.
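One way to probe these numbers is to redo the arithmetic by hand. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that the Nei & Gojobori distances are obtained by applying the Jukes-Cantor correction d = -3/4 ln(1 - 4p/3) to a proportion of differences p, and that -1.0000 is printed as a placeholder whenever that correction cannot be computed (p >= 0.75). The names `jc_distance` and `SENTINEL` are hypothetical and only serve the illustration.

```python
import math

SENTINEL = -1.0  # assumed placeholder for "could not be computed" (hypothetical name)


def jc_distance(p):
    """Jukes-Cantor corrected distance for a proportion of differences p.

    Returns SENTINEL when the logarithm's argument would be <= 0,
    i.e. when p >= 0.75 (or when p is negative).
    """
    if p < 0 or p >= 0.75:
        return SENTINEL
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Values copied from the pairwise output quoted above.
dn, ds = -1.0000, 1.6509

# Dividing the placeholder by dS reproduces the reported ratio.
ratio = dn / ds
print(round(ratio, 4))  # -0.6057

# A saturated comparison yields the placeholder under this assumption.
print(jc_distance(0.80))
```

Under that assumption, the -0.6057 in the output is simply -1.0000 divided by 1.6509, so the ratio would carry no biological meaning for this pair.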

Am I wrong? If not, should I simply replace -1.0000 with 0 each time dN or dS equals -1.0000?
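Whether 0 is a safe substitute depends on what -1.0000 actually stands for. If it marks a value that could not be computed (an assumption here, not something established in this thread), a more cautious option is to record it as missing so that it does not enter downstream summaries as a real zero. A minimal sketch follows; the function `parse_pair` and the expected line layout are hypothetical, inferred only from the single output line quoted above.

```python
import re

# name, ratio, then "(dN dS)" as in the pairwise line quoted in the question
PAIR_RE = re.compile(
    r"^(\S+)\s+(-?\d+\.\d+)\s+\((-?\d+\.\d+)\s+(-?\d+\.\d+)\)\s*$"
)


def parse_pair(line):
    """Parse one pairwise line; negative dN or dS is stored as None (missing)."""
    m = PAIR_RE.match(line.strip())
    if m is None:
        return None  # header or name-only lines do not match
    dn = float(m.group(3))
    ds = float(m.group(4))
    dn = None if dn < 0 else dn
    ds = None if ds < 0 else ds
    # Recompute the ratio only when both parts are usable and dS is non-zero.
    if dn is not None and ds is not None and ds > 0:
        omega = dn / ds
    else:
        omega = None
    return {"name": m.group(1), "dN": dn, "dS": ds, "omega": omega}


result = parse_pair("allele2             -0.6057 (-1.0000 1.6509)")
print(result)
```

With this approach the pair from the question keeps its dS estimate, while dN and the ratio are flagged as missing instead of being forced to 0.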

Thank you for any help!

paml selection
modified 7.2 years ago by Liam Thompson • written 7.3 years ago by Francois Olivier Hébert

That's a really strange error! I'm sorry, I have no clue how to solve your problem. Your explanation seems logically correct, but that would mean that your sequences have 0 non-synonymous sites, and how could that be possible?

written 7.3 years ago by Martombo
7.3 years ago, Liam Thompson (Gothenburg, Sweden) wrote:

From my understanding of the program's output, which is limited, an infinity value means that there are one or more non-synonymous substitutions and no synonymous substitutions in the branch. Obviously an infinity value makes the data difficult to analyse, so an LRT should ideally be performed (e.g. using the lnL values of branch-site Model A versus site Model 1, and of branch-site Model B versus site Model 3). I hope this sheds some light.
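For readers unfamiliar with the LRT mentioned above: it compares the log-likelihoods (lnL) of two nested models through the statistic 2(lnL1 - lnL0). A rough sketch follows, assuming the usual chi-square approximation with degrees of freedom equal to the difference in the number of free parameters. The lnL values are made up for illustration, and the helper names `lrt_statistic` and `chi2_sf` are hypothetical. Only df = 1 and df = 2 are handled, because both have simple closed forms.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null) for nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, df = 1 or 2 only."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Made-up log-likelihoods, NOT taken from any real PAML run.
lnl_null = -1234.56  # simpler (null) model
lnl_alt = -1230.12   # more general (alternative) model

stat = lrt_statistic(lnl_null, lnl_alt)
p_value = chi2_sf(stat, df=2)
print(stat, p_value)
```

The appropriate degrees of freedom, and whether a plain chi-square reference distribution is adequate, depend on the specific pair of models being compared.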
