What does the .eigenval file represent in PLINK 1.9?
9.3 years ago
summeryhx • 0

Hi all,

I got the .eigenval and .eigenvec files using the --pca option in PLINK 1.9.

Can anyone tell me what the values in the .eigenval file represent? Are they variances or standard deviations?

Also, is there an option to output the variance explained by each PC? Thank you.

And is there any rule for choosing which eigenvectors to use as covariates based on the .eigenval file?

plink eigenval • 8.8k views
9.3 years ago

I am not a PLINK user, but a quick look at the documentation told me that the .eigenvec file contains the requested number of principal components (PCs) and the .eigenval file contains the corresponding eigenvalues, one per line. The eigenvalues tell you how much variation is explained by the associated PC. The total variance of the data is the sum of the variances of the individual PCs, i.e. the sum of the elements on the diagonal of the covariance matrix, which is also the sum of its eigenvalues. Therefore the fraction of variance explained by a PC is the ratio of the eigenvalue associated with this PC to the sum of all eigenvalues.

To select how many PCs to use, you can plot the variance explained by each PC in decreasing order (scree plot). There is often an elbow separating the most important PCs from the less important ones. A widely used rule in PCA is therefore to use the PCs to the left of the elbow.
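As a rough illustration of the ratio described above, here is a minimal Python sketch. The function names `variance_explained` and `read_eigenvalues` and the example eigenvalues are made up for illustration; they are not part of PLINK. Note that the fractions are relative to the eigenvalues you pass in, so if the file holds only the top few eigenvalues, the result is each PC's share among those, not its share of the total variance.

```python
def variance_explained(eigenvalues):
    """Return each eigenvalue's share of the sum of the supplied eigenvalues."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def read_eigenvalues(path):
    """Read one eigenvalue per non-empty line, the layout described for .eigenval."""
    with open(path) as handle:
        return [float(line) for line in handle if line.strip()]


# Made-up eigenvalues, for illustration only (not real PLINK output).
example = [4.0, 2.0, 1.0, 1.0]
fractions = variance_explained(example)

# Running total of the fractions, useful for a cumulative scree plot.
cumulative = []
running = 0.0
for fraction in fractions:
    running += fraction
    cumulative.append(running)

for index, (fraction, cum) in enumerate(zip(fractions, cumulative), start=1):
    print(f"PC{index}: {fraction:.3f} (cumulative {cum:.3f})")
```

With a real file you would call `variance_explained(read_eigenvalues(path))`, where `path` points at your own .eigenval file.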


Thank you for your explanation, it is very helpful. But one thing still confuses me. When I ran PCA in R, the total variance usually equals the number of PCs, as in the following example:

                          Comp.1    Comp.2    Comp.3     Comp.4
Standard deviation     1.5748783 0.9948694 0.5971291 0.41644938
Proportion of Variance 0.6200604 0.2474413 0.0891408 0.04335752
Cumulative Proportion  0.6200604 0.8675017 0.9566425 1.00000000

R reports the standard deviation (sd) rather than the variance, so you have to square each sd to get the variance.

total variance = 1.5748783^2 + 0.9948694^2 + 0.5971291^2 + 0.41644938^2 = 4
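The arithmetic above can be checked with a few lines of Python. This is only a sketch that reuses the four standard deviations from the R output; the variable names are arbitrary.

```python
# Standard deviations copied from the R output above.
sds = [1.5748783, 0.9948694, 0.5971291, 0.41644938]

# Variance of each component is the square of its standard deviation.
variances = [sd ** 2 for sd in sds]
total_variance = sum(variances)

# Proportion of variance per component, to compare with the R table.
proportions = [v / total_variance for v in variances]

print(round(total_variance, 4))
print([round(p, 4) for p in proportions])
```

The total comes out very close to 4, and the proportions should match the "Proportion of Variance" row up to rounding.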

But my PLINK .eigenval file is as follows, and it does not follow the rule above (the total variance does not equal the number of PCs, 20). Could you help me explain that? Thank you.

20.0134
2.98845
2.32333
1.94295
1.93421
1.91117
1.88628
1.86544
1.85781
1.84763
1.76204
1.5532
1.3277
1.1808
1.14857
1.13482
1.13316
1.12439
1.1194
1.11312

How many PCs did you request and what's the size of your covariance matrix? My guess is that you have n>20 and you only got the first 20 eigenvalues corresponding to the first 20 PCs. In PCA, the data is often standardized first. In that case, the sum of the eigenvalues equals the number of variables since all variables have a variance of 1.
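To make the standardization point concrete, here is a small Python sketch using made-up random data (nothing here comes from PLINK or from the poster's dataset). After standardizing, each variable has variance 1, so the total variance, which equals the sum of all eigenvalues of the resulting covariance matrix, is the number of variables. A sum taken over only the first few eigenvalues would be smaller than that total.

```python
import random
import statistics


def standardize(column):
    """Centre a column on its mean and scale it to unit (population) variance."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(x - mean) / sd for x in column]


random.seed(0)
n_samples = 50
scales = (1, 5, 10, 100)  # made-up, deliberately different spreads per variable

# One list per variable, each drawn with a different spread.
columns = [[random.gauss(0, scale) for _ in range(n_samples)] for scale in scales]
standardized = [standardize(col) for col in columns]

# Trace of the covariance matrix = sum of per-variable variances
# = sum of all its eigenvalues.
total_variance = sum(statistics.pvariance(col) for col in standardized)

print(len(scales), round(total_variance, 6))
```

The printed total should equal the number of variables up to floating-point error, whatever the original scales were.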
