SNP standardization and SNP-SNP (inverse) covariance matrix
8.2 years ago
dinoh ▴ 30

Hello computational biologists!

I am a statistician trying to get into biostatistics.

My questions concern the feasibility, usefulness and relevance of computing the variance-covariance matrix of SNP data.

I have three questions:

  1. My limited understanding of SNPs is that they are categorical, not ordinal, data. And yet people seem to standardize SNP data. How well accepted is this practice? Am I missing something?
  2. Suppose we have a standardized SNP matrix M. The authors of the same paper compute a covariance matrix: X = M'M/n. How useful is this outside of eigenanalysis? Are the covariance values useful for any other downstream analysis?
  3. I would like to try computing the inverse covariance matrix, Omega = X^{-1} = (M'M/n)^{-1}. Would this be of any interest from a biological standpoint?

If there is a good reference for understanding more about the biology from a statistical perspective, I would appreciate any pointers.

Thank you!

SNP • 2.3k views
8.2 years ago

My 2p:

1)

SNPs is that it is categorical and not ordinal data

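I'm not sure this is correct. A SNP genotype is usually coded as an allele count: the number of copies of a given allele (reference or variant, depending on the convention) that an individual carries. So 0, 1, 2 for a diploid organism, or 0, 1, 2, 3, 4 for a tetraploid. That makes it a count rather than an unordered category, which is why it gets treated as numeric and standardized. The paper you link says: "Let C(i,j) be the number of variant alleles for marker j, individual i. (Thus for autosomal data we have C(i,j) is 0,1 or 2.)"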
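
As a rough sketch of what standardizing allele counts can look like (not necessarily the exact scaling used in the linked paper), the snippet below centers each SNP column and divides by its standard deviation. The toy matrix `C` and the helper name `standardize` are made up for illustration, and dividing by n rather than n - 1 is an assumption chosen so that M'M/n has ones on its diagonal.

```python
from math import sqrt


def standardize(genotypes):
    """Center and scale each SNP column of an individuals-by-SNPs count matrix.

    Assumption: scale by the column standard deviation (dividing by n).
    Other conventions exist, e.g. scaling by allele frequency.
    """
    n = len(genotypes)
    scaled_columns = []
    for col in zip(*genotypes):  # iterate over SNPs (columns)
        mean = sum(col) / n
        sd = sqrt(sum((g - mean) ** 2 for g in col) / n)
        if sd == 0:
            # A SNP with no variation carries no information and cannot be scaled.
            raise ValueError("monomorphic SNP cannot be standardized")
        scaled_columns.append([(g - mean) / sd for g in col])
    # Transpose back to individuals-by-SNPs.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical toy data: 4 diploid individuals, 2 SNPs, counts in {0, 1, 2}.
C = [[0, 2], [1, 1], [2, 0], [1, 1]]
M = standardize(C)
```

After this step every column of `M` has mean 0 and mean square 1, so the 0/1/2 coding is being used as a number, not as a label.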

2)

Are the covariance values useful for any other downstream analysis?

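I think they can be used for computing linkage disequilibrium (LD): with standardized columns, each entry of X is the correlation r between the genotype counts of two SNPs, and r^2 is a commonly used LD measure.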
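
A minimal sketch of that idea, assuming each SNP is standardized by its own mean and standard deviation (dividing by n): X = M'M/n is then the SNP-SNP correlation matrix, and squaring its entries gives a genotype-based r^2. The toy counts and the function names are hypothetical; for real data a library such as NumPy or a dedicated genetics tool would be the usual choice.

```python
from math import sqrt


def standardize_column(col):
    """Return one SNP column with mean 0 and variance 1 (dividing by n)."""
    n = len(col)
    mean = sum(col) / n
    sd = sqrt(sum((g - mean) ** 2 for g in col) / n)
    return [(g - mean) / sd for g in col]


def snp_covariance(genotypes):
    """Compute X = M'M/n, where M is the column-standardized genotype matrix."""
    n = len(genotypes)
    cols = [standardize_column(c) for c in zip(*genotypes)]
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in cols]
            for ci in cols]


# Hypothetical toy data: 5 individuals (rows) by 3 SNPs (columns).
C = [
    [0, 0, 2],
    [1, 1, 0],
    [2, 2, 1],
    [1, 2, 1],
    [0, 1, 2],
]
X = snp_covariance(C)

# Squared correlation between each pair of SNPs, a genotype-based LD measure.
r2 = [[x * x for x in row] for row in X]
```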

3)

compute the inverse covariance matrix

Not sure, I'm tempted to say yes though...!
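
One thing the inverse does give you, statistically speaking, is partial correlations: the partial correlation between SNPs i and j given all the others is -Omega_ij / sqrt(Omega_ii * Omega_jj). Whether that is biologically interesting is a separate question. The sketch below uses a made-up 3x3 correlation matrix in which SNP 0 and SNP 2 are correlated only through SNP 1; the helper names are hypothetical, and the plain Gauss-Jordan inversion is only meant for tiny examples. Note that with more SNPs than individuals X is singular, so some form of regularization would be needed before inverting.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; regularization would be needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


def partial_correlations(omega):
    """Convert a precision matrix Omega into a matrix of partial correlations."""
    n = len(omega)
    return [[1.0 if i == j
             else -omega[i][j] / sqrt(omega[i][i] * omega[j][j])
             for j in range(n)]
            for i in range(n)]


# Made-up correlation matrix for three SNPs along a chain (0 - 1 - 2).
X = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]
Omega = invert(X)
P = partial_correlations(Omega)
```

Here SNP 0 and SNP 2 have a marginal correlation of 0.25, but their partial correlation in `P` is essentially zero once SNP 1 is accounted for, which is the kind of direct versus indirect association the inverse can separate.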
