Question: how to return the components from PCA back to original variables?
1
4.0 years ago by
M K510
United States
M K510 wrote:

I used principal component analysis (PCA) in R to reduce the number of explanatory (independent) variables in my model (i.e., PCA was used for variable reduction only). After running PCA, I got 10 components. What I want to do now is map these components back to the original variables (i.e., I want to know which variables are inside each of these components). My original data matrix contains 35,000 rows and 500 columns.

R • 6.3k views
modified 4.0 years ago by Mikael Huss4.7k • written 4.0 years ago by M K510
1

What you want are probably the loadings. If you don't have access to them, you can try to calculate them manually. Anyone correct me if I am wrong, but the loadings are essentially the correlations between the standardized original variables and the PCs.
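
A minimal sketch of that manual calculation, assuming the PCA was run with prcomp() on centered and scaled data. The matrix X here is simulated and only stands in for the real data; with other PCA routines or other centering/scaling choices the relationship may not hold exactly.

```r
# Simulated stand-in for the real data (an assumption, not the poster's matrix).
set.seed(42)
X <- matrix(rnorm(300 * 5), nrow = 300, ncol = 5,
            dimnames = list(NULL, paste0("var", 1:5)))
X[, 3] <- 0.8 * X[, 1] + 0.2 * X[, 3]  # make var3 depend on var1

pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Correlation of each original variable (rows) with each PC score (columns).
cor_loadings <- cor(X, pca$x)

# The same quantity from the PCA object: multiply each column of $rotation
# (the eigenvectors) by the standard deviation of the corresponding PC.
scaled_rotation <- sweep(pca$rotation, 2, pca$sdev, "*")

# Largest absolute difference between the two; should be at rounding level.
max_diff <- max(abs(cor_loadings - scaled_rotation))
```

Note that pca$rotation on its own holds the unit-length eigenvectors, so it only equals the correlations after the rescaling by pca$sdev shown above.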

written 4.0 years ago by Damian Kao15k

Do you still have the matrix of loadings, or just the 10 PCs?

written 4.0 years ago by John12k
1

I have both of them.

written 4.0 years ago by M K510

Then you can certainly get very close to your original data, and you can even interpolate data you don't have, if you wish. Of course, it depends on how much of the variance the 10 PCs explain, but it's likely to be most of it. (...right? It should be in the report.)

Unfortunately, PCA results can differ from implementation to implementation depending on how the centering is done and some other minor details, so you really will have to take a proper look at the code used to generate the loadings and PCs if you didn't use the standard R prcomp().

A good place to start is: http://stats.stackexchange.com/questions/229092/how-to-reverse-pca-and-reconstruct-original-variables-from-several-principal-com

and an R specific demo here: http://stats.stackexchange.com/questions/57467/how-to-perform-dimensionality-reduction-with-pca-in-r/57478#57478
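
For what it's worth, here is a rough sketch of the reconstruction, assuming prcomp() was used. The data are simulated, and reconstruct() is a hypothetical helper written for this example, not a function from any package. It relies on prcomp() storing the centering and scaling vectors in $center and $scale.

```r
# Simulated stand-in for the real data (an assumption).
set.seed(7)
X <- matrix(rnorm(100 * 8), nrow = 100, ncol = 8)
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Hypothetical helper: approximate the original data from the first k PCs.
reconstruct <- function(pca, k) {
  # Map the first k score columns back to (standardized) variable space.
  X_hat <- pca$x[, 1:k, drop = FALSE] %*% t(pca$rotation[, 1:k, drop = FALSE])
  # Undo the scaling and then the centering, if prcomp() applied them
  # (prcomp() stores FALSE instead of a numeric vector when it did not).
  if (is.numeric(pca$scale))  X_hat <- sweep(X_hat, 2, pca$scale, "*")
  if (is.numeric(pca$center)) X_hat <- sweep(X_hat, 2, pca$center, "+")
  X_hat
}

# Using all PCs recovers the data up to rounding error.
err_full <- max(abs(reconstruct(pca, ncol(X)) - X))

# Using only a few PCs gives an approximation.
X_3   <- reconstruct(pca, 3)
err_3 <- mean((X_3 - X)^2)

# Cumulative share of variance explained by the first k PCs.
var_explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
```

How close the k-component version gets depends on var_explained[k]. If a different PCA routine was used, the centering and scaling steps would need to be adapted to whatever that routine did.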

written 4.0 years ago by John12k
5
4.0 years ago by
Matt Shirley9.4k
Cambridge, MA
Matt Shirley9.4k wrote:

I don't believe this is possible. Principal components are derived by projecting the data onto a vector that maximizes the spread, or variance, along that vector (see here, mostly for the visualizations). Asking which variables contributed most to this projection is a difficult question, similar to asking which points in a linear fit contribute most to its slope.

Which points would you pick and why? The principal components reconstruct the relationships in the data, but are derived from the data in a way that doesn't directly relate to any individual features of the data.

written 4.0 years ago by Matt Shirley9.4k
2
4.0 years ago by
Mikael Huss4.7k
Stockholm
Mikael Huss4.7k wrote:

I think you are looking for the loadings. Depending on which method you used in R, these could be in the "loadings" or "rotation" slot in the object returned from the PCA routine. The loadings tell you how the original variables are weighted to form each principal component.
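
As a small illustration, assuming either prcomp() or princomp() from base R was used: the data below are simulated, and the variable names var1 to var6 are made up for the example.

```r
# Simulated stand-in for the real 35,000 x 500 matrix (an assumption).
set.seed(1)
X <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6,
            dimnames = list(NULL, paste0("var", 1:6)))
X[, 2] <- X[, 1] + rnorm(200, sd = 0.1)  # make var1 and var2 strongly correlated

# prcomp() keeps the weights in $rotation:
# rows are the original variables, columns are the principal components.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
rot <- pca$rotation

# princomp() keeps the corresponding matrix in $loadings.
pca2 <- princomp(X, cor = TRUE)
lod  <- unclass(pca2$loadings)

# Variables with the largest absolute weight on the first component.
top_pc1 <- names(sort(abs(rot[, "PC1"]), decreasing = TRUE))[1:2]
print(top_pc1)
```

Sorting the absolute weights column by column like this gives a ranking of the original variables for each component. The sign of a column is arbitrary, so the two routines can return the same component with opposite signs.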

written 4.0 years ago by Mikael Huss4.7k