Interpreting theta value
3.5 years ago
yolek64754 ▴ 30

Hi everyone,

I am computing Watterson's theta for the first time, using ANGSD. As I understand it, this value should lie between 0 and 1. However, ANGSD uses a likelihood-based approach, and this is the result I get when I compute theta on chr22 from a single BAM file:

(indexStart,indexStop)(firstPos_withData,lastPos_withData)(WinStart,WinStop)   Chr     WinCenter       tW      tP      tF      tH      tL      Tajima  fuf     fud     fayh    zeng    nSites
(0,32860927)(16050032,51244307)(0,51244307)     22      25622153        31780.907015    31780.907015    31780.907015    31780.907015    31780.907015    -nan    -nan    -nan    -nan    -nan    32860927

What I am interested in is tW (Watterson's theta), but this value makes no sense to me. I think I have to divide it by nSites, which would give tW = 0.0009671336118. Does it make sense to do that?
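The division described above can be sketched in a few lines of Python. The two numbers are copied from the window output above; the variable names are only illustrative, and reading the tW column as a sum over all sites in the window is an assumption of this sketch, not something confirmed in this thread.

```python
# Values copied from the window output above.
tW_window = 31780.907015   # tW column; assumed here to be a sum over the window
n_sites = 32860927         # nSites column

# Per-site Watterson's theta: window total divided by the number of sites.
tW_per_site = tW_window / n_sites
print(f"{tW_per_site:.10f}")
```

If that assumption holds, the per-site value is the one that is comparable between windows of different sizes.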

I also have per-site results on a log scale that look like this:

#Chromo Pos Watterson   Pairwise    thetaSingleton  thetaH  thetaL
22  16050032    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050033    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050034    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050035    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050036    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670

Is it better to compute the overall Watterson's theta from these values? If so, should I still sum them and divide by the total number of sites?
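As a rough sketch of the sum-and-divide idea, the per-site values can be back-transformed from the log scale before they are added up. The five lines below are copied from the per-site output above; the function name `mean_watterson` is hypothetical, and the code assumes the values are natural logarithms, which the thread does not state.

```python
import math

# Per-site lines copied from the output above (whitespace-separated).
per_site_text = """\
#Chromo Pos Watterson   Pairwise    thetaSingleton  thetaH  thetaL
22  16050032    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050033    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050034    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050035    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
22  16050036    -7.634670   -7.634670   -7.634670   -7.634670   -7.634670
"""


def mean_watterson(text):
    """Return (mean per-site theta, number of sites).

    Assumes column 3 holds Watterson's theta as a natural log.
    """
    total = 0.0
    n = 0
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = line.split()
        log_tw = float(fields[2])   # third column: Watterson (log scale)
        total += math.exp(log_tw)   # back-transform before summing
        n += 1
    return total / n, n


mean_tw, n = mean_watterson(per_site_text)
print(mean_tw, n)
```

Summing the log values directly would not give the same quantity, because the mean of logs is not the log of the mean.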

Sorry for the newbie questions; I am new to this field. Thanks a lot.

theta statistics popgen angsd • 1.9k views

WIKI: In population genetics, the Watterson estimator is a method for describing the genetic diversity in a population. So what does it mean at the level of a single genomic coordinate? Maybe theta is each site's contribution. How does it vary across the chromosome?


I think that, done this way, it uses a prior based on knowing the true reference at each site? I have no clear idea either, but this is how I understand it. This is how the value fluctuates, which doesn't really make sense to me either:

-7.634670
-7.634670
-7.634670
-7.634670
-7.634670
-7.634670
-7.634670
-7.634670
-7.634670
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-8.327583
-9.020618
-9.020618
-8.327583
-9.020618
-9.020618
-9.020618
-9.020618

Hello,

I'm facing the same problem. I have the ANGSD output and I was wondering how to get Pi and Theta from the results. Averaging the values gives me -inf. Have you found a solution?

Many thanks
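One possible explanation for the -inf result, offered only as a guess: if any per-site value is -inf on the log scale (a theta of zero), averaging the log values directly yields -inf. The sketch below uses three values seen earlier in this thread plus one hypothetical -inf entry, and again assumes natural logs.

```python
import math

# Three log-scale values seen earlier in the thread, plus one
# hypothetical -inf entry standing in for a site with theta = 0.
log_values = [-7.634670, -8.327583, float("-inf"), -9.020618]

# Averaging on the log scale: a single -inf drags the mean to -inf.
naive_mean = sum(log_values) / len(log_values)

# Back-transform first (exp(-inf) is 0.0), then average.
thetas = [math.exp(v) for v in log_values]
mean_theta = sum(thetas) / len(thetas)

print(naive_mean)   # -inf
print(mean_theta)   # a small positive number
```

Whether this matches the actual cause depends on the contents of the output file, which are not shown here.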
