Question: Understanding standardized genotype matrix
3.4 years ago by
United Kingdom
This might be really basic, but I am trying to understand why a standardized genotype matrix is constructed this way.

In a standardized genotype matrix, each element w_ij is defined as:

w_ij = (x_ij - 2 p_i) / sqrt(2 p_i (1 - p_i))

where x_ij is the number of copies of the reference allele at the i-th SNP in the j-th individual, and p_i is the frequency of the reference allele at the i-th SNP.

I understand that the number of reference alleles an individual carries depends on the allele frequency of that SNP, so standardizing by the allele frequency seems like a good idea. But why is it standardized in this particular way? Is it a common standardization? Does it have a name?

3.4 years ago by
United States
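Not a geneticist, but using your notation: for autosomes, if the two alleles an individual carries are independent draws (Hardy-Weinberg equilibrium), then x follows a Binomial(2, p) distribution. Its expected (mean) value is 2p and its standard deviation is sqrt(2p(1-p)). So this is a z-score: subtract the mean, divide by the standard deviation. The standardized values (w_ij) will have mean zero and variance approximately 1 for each SNP, which puts rare and common SNPs on the same scale and is a convenient property for some calculations.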
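If it helps to see the z-score numerically, here is a minimal NumPy sketch. It assumes a SNPs-by-individuals matrix of 0/1/2 allele counts and estimates p_i from the sample itself; the function name `standardize_genotypes` and the simulated data are made up for illustration and do not come from any particular genetics package.

```python
import numpy as np

def standardize_genotypes(x):
    """Z-score a genotype matrix of shape (n_snps, n_individuals).

    Entries are counts (0, 1, 2) of the reference allele. The allele
    frequency p_i is estimated from the sample (an assumption of this
    sketch). Monomorphic SNPs have zero standard deviation and come
    out as NaN.
    """
    x = np.asarray(x, dtype=float)
    p = x.mean(axis=1, keepdims=True) / 2.0   # estimated p_i per SNP
    sd = np.sqrt(2.0 * p * (1.0 - p))         # Binomial(2, p) standard deviation
    with np.errstate(divide="ignore", invalid="ignore"):
        w = (x - 2.0 * p) / sd                # w_ij = (x_ij - 2 p_i) / sd_i
    return w, p.ravel()

# Simulate 3 SNPs in 5000 individuals under Hardy-Weinberg equilibrium.
rng = np.random.default_rng(0)
true_p = np.array([0.1, 0.3, 0.5])
x = rng.binomial(2, true_p[:, None], size=(3, 5000))

w, p_hat = standardize_genotypes(x)
row_means = w.mean(axis=1)      # zero up to floating-point error
row_variances = w.var(axis=1)   # close to 1 when the sample is near HWE
```

The row means are zero by construction because p_i is estimated from the same data. The row variances are only close to 1 (not exactly 1), since the divisor is the theoretical Binomial(2, p) standard deviation rather than the sample standard deviation.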

