Question: PCA analysis with R
4 weeks ago by
manjumoorthy95 10 wrote:

I am using the autoplot() function from the ggfortify library in R. autoplot() also handles cluster analysis objects. Which algorithm does autoplot() use to find the first two principal components?

autoplot pca R ggfortify • 206 views
modified 4 weeks ago by Kevin Blighe 42k • written 4 weeks ago by manjumoorthy95 10

ggfortify has a vignette, "Plotting PCA (Principal Component Analysis)". Which part is not clear? Please provide example data and code.

modified 4 weeks ago by WouterDeCoster 39k • written 4 weeks ago by zx8754 7.3k

autoplot(pam(iris[-5], 3), frame = TRUE, frame.type = 'norm')

Here, autoplot() finds the first two principal components for the clustering object returned by pam(). Which algorithm does autoplot() use to do this?

written 4 weeks ago by manjumoorthy95 10
4 weeks ago by
Republic of Ireland
Kevin Blighe 42k wrote:

The PCA function that it uses is prcomp(), which is the same as what my own package (PCAtools) and DESeq2 use.

Yes, it performs partitioning around medoids (PAM) and identifies a user-specified number of clusters (the second argument to pam()). autoplot() then performs PCA on the dataset and shades the points according to their PAM cluster assignments. Here is the proof:

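library(ggfortify)  # autoplot() methods for prcomp and pam objects
library(cluster)    # pam()
library(gridExtra)  # grid.arrange()

# PCA computed directly on the four numeric iris columns
g1 <- autoplot(prcomp(iris[-5]), frame = TRUE, frame.type = 'norm')
# PAM clustering with 2 clusters, passed straight to autoplot()
g2 <- autoplot(pam(iris[-5], 2), frame = TRUE, frame.type = 'norm')
grid.arrange(g1, g2, ncol = 2)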

[Image: screenshot of the resulting plot, 2019-04-24]

They are the same points, but highlighted differently.
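The same two steps can also be carried out by hand, which makes them explicit. This is only a minimal sketch, assuming base R plus the cluster package; the names x, fit, pca and scores are illustrative. It mimics what autoplot() appears to do rather than calling ggfortify's own code, so the axis scaling may differ from the autoplot() output.

```r
library(cluster)  # pam()

x <- iris[-5]      # the four numeric iris columns
fit <- pam(x, 2)   # step 1: PAM clustering with 2 clusters
pca <- prcomp(x)   # step 2: PCA on the same, unclustered data

# Combine the first two PC scores with the PAM cluster labels
scores <- as.data.frame(pca$x[, 1:2])
scores$cluster <- factor(fit$clustering)

# Same points as a plain PCA plot, coloured by PAM cluster
plot(scores$PC1, scores$PC2, col = scores$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2")
```

The clustering never changes the coordinates: the positions come from prcomp() alone, and pam() only supplies the colours.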

As is typical with many CRAN (and other) packages, the documentation is poor and the program functionality does not make it readily obvious what the function is doing.

modified 4 weeks ago • written 4 weeks ago by Kevin Blighe 42k

I haven't often used autoplot but I didn't notice the automatic PCA on cluster objects. Indeed the doc is quite misleading. The only hint is the axis labels.

written 4 weeks ago by Jean-Karim Heriche 18k

Yes, I literally had to run the code myself to find out... something did not seem correct!

written 4 weeks ago by Kevin Blighe 42k

Thank you so much, I was so confused by the documentation. So autoplot() uses prcomp() to find the first two principal components?

written 4 weeks ago by manjumoorthy95 10

One can infer that it uses prcomp() based on my example above, yes. However, the functionality of the program should be improved as it leaves much room for doubt.

written 4 weeks ago by Kevin Blighe 42k

OK, thank you so much. The documentation was driving me crazy. Your explanation helped me a lot.

written 29 days ago by manjumoorthy95 10
4 weeks ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche 18k wrote:

In PCA, principal components are ordered by the fraction of variance explained (i.e. eigenvalues of the covariance matrix). If this doesn't make sense to you, please read some tutorial on PCA.
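As a small illustration of that ordering, the sketch below compares the variances reported by prcomp() with the eigenvalues of the covariance matrix on the iris data used elsewhere in this thread. It assumes prcomp()'s defaults (centred, not scaled); the variable names are illustrative.

```r
x <- iris[-5]                 # the four numeric iris columns

pca <- prcomp(x)              # defaults: center = TRUE, scale. = FALSE
eig <- eigen(cov(x))$values   # eigenvalues of the covariance matrix

# The squared standard deviations of the PCs are those eigenvalues
print(all.equal(pca$sdev^2, eig))

# Fraction of variance explained by each PC, largest first
explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(explained, 3))
```

PC1 and PC2 are therefore simply the two directions with the largest eigenvalues.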

written 4 weeks ago by Jean-Karim Heriche 18k

OK, thank you. So in the above method, the clustering is performed first by pam(), and the clustered data points are then positioned according to the PC1 and PC2 plotted by autoplot(), right?

written 4 weeks ago by manjumoorthy95 10

If you're talking about this line:

autoplot(pam(iris[-5], 3), frame = TRUE, frame.type = 'norm')

then there's no PCA. autoplot() is a "smart" plotting function. It recognizes what objects are passed to it and calls the appropriate specialized plotting function. If you pass it an object from the cluster package, it plots the data and automatically colours points according to cluster labels. If you pass it a PCA object then it will plot the data against the first two PCs.

EDIT: I am wrong. autoplot does indeed perform PCA on cluster objects. See Kevin's answer.
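The "recognizes what objects are passed to it" part is ordinary S3 dispatch on the object's class. The sketch below shows the mechanism with a hypothetical generic called describe(); it is not part of ggfortify, and the strings it returns only summarise the behaviour reported in this thread.

```r
library(cluster)  # pam()

# Hypothetical generic, for illustration only (not a ggfortify function)
describe <- function(object, ...) UseMethod("describe")

# One method per class; R picks the method from class(object)
describe.prcomp <- function(object, ...) {
  "PCA object: plot PC1 against PC2 directly"
}
describe.pam <- function(object, ...) {
  "PAM object: run PCA on the data, colour points by cluster"
}
describe.default <- function(object, ...) {
  "no specialised method for this class"
}

print(class(pam(iris[-5], 3)))
print(describe(prcomp(iris[-5])))
print(describe(pam(iris[-5], 3)))
```

Because the method is chosen from the class alone, the call autoplot(pam(...)) gives no visible hint that a PCA happens inside it.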

modified 4 weeks ago • written 4 weeks ago by Jean-Karim Heriche 18k