Question: Effect size of a SNP - contribution to genetic variance
4.2 years ago by
Peixe610 wrote:


I am trying to find a formal reference for this equation, along with more details about it:

[Equation image; as discussed in the answers: ES = 2 * Beta^2 * f * (1 - f)]

I saw it in the Online Methods section of this Nature Genetics paper by Park et al. (2010). The equation gives the contribution of a SNP to the genetic variance of a trait, which the authors call its effect size (ES), as a function of its regression effect (Beta) and allele frequency (f).

I have searched in books, articles and throughout the internet, but found nothing.

Could someone provide a more detailed reference or explanation about it?

Thank you!

4.2 years ago by
United States
Collin870 wrote:

Here is my guess: ES is the variance that can be attributed to a particular SNP in the regression equation. Say you have a regression model Y = BX + epsilon, where Y is your phenotype, B is your regression coefficient, epsilon is the random noise, and X is 0, 1, or 2 depending on the genotype at the SNP (the number of copies of the allele). ES then seems to be ES = Var[BX] = B^2 Var[X], where Var is the variance; remember that a constant factor can be pulled out of a variance, but it has to be squared. Under Hardy-Weinberg equilibrium X follows a binomial distribution with 2 trials, so Var[X] = 2f(1-f), where f is the minor allele frequency. Altogether you have ES = 2 B^2 f(1-f). Perhaps someone with more background in genome-wide association studies can give you a more definitive answer, though.
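
The identity can be checked numerically. The following is only a minimal sketch, not code from Park et al.: the function names `genotype_probs`, `snp_effect_size` and `closed_form` are made up for illustration, and Hardy-Weinberg genotype frequencies are assumed. It computes Var[BX] directly from the distribution of X and compares it with 2 B^2 f(1-f).

```python
def genotype_probs(f):
    """Hardy-Weinberg probabilities of carrying 0, 1 or 2 copies
    of an allele with frequency f (assumed model, see lead-in)."""
    return {0: (1 - f) ** 2, 1: 2 * f * (1 - f), 2: f ** 2}


def snp_effect_size(beta, f):
    """Var[beta * X], computed directly from the genotype distribution."""
    probs = genotype_probs(f)
    mean = sum(beta * x * p for x, p in probs.items())
    return sum((beta * x - mean) ** 2 * p for x, p in probs.items())


def closed_form(beta, f):
    """The formula under discussion: ES = 2 * beta^2 * f * (1 - f)."""
    return 2 * beta ** 2 * f * (1 - f)


if __name__ == "__main__":
    beta, f = 0.5, 0.3  # arbitrary illustrative values
    print(snp_effect_size(beta, f), closed_form(beta, f))
```

Both numbers agree up to floating-point rounding for any beta and any f between 0 and 1, because Var[X] = 2f(1-f) holds exactly for a binomial variable with 2 trials.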


Hi Collin! I had already figured out more or less the same: it contains everything an effect size formula should contain. What I ideally need, however, is a formal textbook reference or something similar. Thanks anyway for your time!

written 4.2 years ago by Peixe610

Hi Peixe,

In my opinion, the quantity described by "The effect size, as defined above, corresponds to the contribution of the locus to the genetic variance of the trait under Hardy-Weinberg equilibrium and an additive polygenic model (Park et al., 2010)" should be ES/Var(G) rather than ES as they defined it, where Var(G) is the genetic variance of the trait. My proof follows:

Model: the phenotypic value P consists of a genetic value G and an environmental value E, so P = G + E. In total there are many susceptibility (i.e., causal) SNPs; assume these SNPs are independent. Suppose one susceptibility SNP has genetic value G1 = beta*X, where beta is the effect of this SNP on the phenotype and X is the number of copies of the allele. G1 is part of G, so G = G1 + GR, where GR is the remaining genetic value.

The genetic variance explained by that susceptibility SNP is the coefficient of determination for the regression of G on G1, with G = G1 + GR. It can be calculated as Cov(G,G1)^2/(Var(G)Var(G1)). As G1 is part of G and the SNPs are independent, the covariance in the numerator is Cov(G,G1) = Cov(G1+GR,G1) = Var(G1) + Cov(GR,G1) = Var(G1). Thus the coefficient of determination equals Var(G1)^2/(Var(G)Var(G1)) = Var(G1)/Var(G), which is the proportion of the genetic variance contributed by that susceptibility SNP.

However, Var(G1) itself is the effect size defined by Park et al. In my understanding, the definition of ES given by the formula and the one given in words are different things. Can you explain where I misunderstand their definition? Thank you.

Besides, I think effect size is a general term. It may refer to a mean difference, a log odds ratio, or a correlation coefficient. In the paper, I think the authors simply define an effect size that reflects the contribution of a causal SNP to the genetic variance.
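
The coefficient-of-determination argument above can be checked by brute-force enumeration. This is only a sketch under the same assumptions (independent SNPs, Hardy-Weinberg genotype frequencies, additive effects); the helper names `genotype_probs` and `r_squared_first_snp` and the example betas and frequencies are invented for illustration.

```python
from itertools import product


def genotype_probs(f):
    """Hardy-Weinberg probabilities of 0, 1 or 2 copies of an allele."""
    return {0: (1 - f) ** 2, 1: 2 * f * (1 - f), 2: f ** 2}


def r_squared_first_snp(betas, freqs):
    """Return (Cov(G,G1)^2 / (Var(G) Var(G1)), Var(G1) / Var(G)) for
    G = sum_i beta_i * X_i and G1 = beta_1 * X_1, enumerating every
    joint genotype of the independent SNPs."""
    dists = [genotype_probs(f) for f in freqs]
    joint = []  # tuples of (probability, G, G1)
    for genos in product((0, 1, 2), repeat=len(betas)):
        p = 1.0
        for x, dist in zip(genos, dists):
            p *= dist[x]  # independence: joint probability is a product
        g = sum(b * x for b, x in zip(betas, genos))
        joint.append((p, g, betas[0] * genos[0]))

    mean_g = sum(p * g for p, g, _ in joint)
    mean_g1 = sum(p * g1 for p, _, g1 in joint)
    var_g = sum(p * (g - mean_g) ** 2 for p, g, _ in joint)
    var_g1 = sum(p * (g1 - mean_g1) ** 2 for p, _, g1 in joint)
    cov = sum(p * (g - mean_g) * (g1 - mean_g1) for p, g, g1 in joint)
    return cov ** 2 / (var_g * var_g1), var_g1 / var_g


if __name__ == "__main__":
    r2, ratio = r_squared_first_snp([0.5, 0.2, -0.3], [0.3, 0.1, 0.4])
    print(r2, ratio)
```

The two returned values coincide up to rounding, so under these assumptions the variance explained by the SNP is Var(G1)/Var(G), while ES = Var(G1) is the unnormalised numerator.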

modified 2.1 years ago • written 2.1 years ago by Sunshine n Rain20

My colleague told me that I had misunderstood the meaning of "the contribution of the locus to the genetic variance of the trait". It is not the same as the genetic variance explained by that SNP; it is just the variance of that SNP's genetic value, Var(G1).

modified 2.1 years ago • written 2.1 years ago by Sunshine n Rain20

Hi Collin, I think BX here represents the genetic value of one causal SNP. The ES is not the same as Var(Y).

written 2.1 years ago by Sunshine n Rain20

Yes, one would need to add the variance of the epsilon term to get the full variance of Y: Var[Y] = Var[BX] + Var[epsilon], assuming epsilon is uncorrelated with X. But the variance of the random noise term wouldn't be considered a genetic influence on the phenotype.
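
As a rough illustration of this split, here is a small seeded simulation. It is a sketch only: the function name `simulate_variance_split`, the normal noise model and all parameter values are assumptions made for the example, not something taken from the paper.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def simulate_variance_split(beta, f, noise_sd, n, seed=0):
    """Simulate Y = beta * X + epsilon for n individuals and return
    (sample Var[beta * X], sample Var[Y])."""
    rng = random.Random(seed)
    genetic, phenotype = [], []
    for _ in range(n):
        # X = number of copies of the allele: two independent draws,
        # each carrying the allele with probability f
        x = (rng.random() < f) + (rng.random() < f)
        g = beta * x
        genetic.append(g)
        # epsilon is drawn independently of X (assumed normal noise)
        phenotype.append(g + rng.gauss(0.0, noise_sd))
    return variance(genetic), variance(phenotype)


if __name__ == "__main__":
    var_g, var_y = simulate_variance_split(0.5, 0.3, 1.0, 50000)
    print(var_g, var_y)
```

With these values the genetic part should come out close to 2 * 0.5^2 * 0.3 * 0.7 = 0.105 and the total close to 0.105 + 1.0, up to sampling error; only the first term counts towards the SNP's ES.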

written 2.1 years ago by Collin870