4.9 years ago by

EMBL Heidelberg, Germany

I am not a plink user, but a quick look at the documentation told me that the .eigenvec file contains the requested number of principal components (PCs) and the .eigenval file contains the corresponding eigenvalues, one per line. The eigenvalues tell you how much variation is explained by the associated PC. The total variance of the data is the sum of the variances of the individual PCs, i.e. the sum of the elements on the diagonal of the covariance matrix, which is also the sum of its eigenvalues. Therefore the fraction of variance explained by a PC is the ratio of the eigenvalue associated with this PC to the sum of all eigenvalues. To select how many PCs to use, you can plot the variance explained by each PC in decreasing order (scree plot). There is often an elbow separating the most important PCs from the less important ones. A widely used rule in PCA is therefore to keep the PCs to the left of the elbow.
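
As a rough illustration of the ratio described above, here is a minimal Python sketch. It assumes the eigenvalue file holds one number per line, as described; the function names and the example values are made up for illustration and do not come from plink. One caveat: if only the requested top PCs were written to the file, the sum runs over those PCs only, so the fractions are relative to the retained PCs rather than to the total variance of the data.

```python
from pathlib import Path


def read_eigenvalues(path):
    """Read eigenvalues from a text file with one value per line."""
    return [float(token) for token in Path(path).read_text().split()]


def variance_explained(eigenvalues):
    """Fraction of variance per PC: eigenvalue / sum of all eigenvalues given."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def cumulative(fractions):
    """Running total of the fractions, useful next to a scree plot."""
    running = 0.0
    result = []
    for fraction in fractions:
        running += fraction
        result.append(running)
    return result


# Hypothetical eigenvalues, for illustration only (not real plink output).
example = [4.0, 2.0, 1.0, 1.0]
fractions = variance_explained(example)
for pc, (frac, cum) in enumerate(zip(fractions, cumulative(fractions)), start=1):
    print(f"PC{pc}: {frac:.3f} of variance, {cum:.3f} cumulative")
```

With a real run you would call something like `variance_explained(read_eigenvalues("plink.eigenval"))`, where the file name is an assumption based on the output prefix you used. Plotting the resulting fractions against the PC index gives the scree plot in which to look for the elbow.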