biplot scaling options (ggbiplot)
4.4 years ago
Lucy ▴ 140

Hi,

I have a prcomp object (generated using the prcomp function) and I am trying to generate a biplot using ggbiplot. However, I am confused about the different scaling options and how they affect the meaning of the plot.

ggbiplot(pcobj, choices = 1:2, scale = 1, pc.biplot = TRUE, obs.scale = 1 - scale, var.scale = scale)

scale = covariance biplot (scale = 1), form biplot (scale = 0). When scale = 1, the inner product between the variables approximates the covariance and the distance between the points approximates the Mahalanobis distance.

obs.scale = scale factor to apply to observations

var.scale = scale factor to apply to variables
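
For readers who want to see what the two extremes of scale mean numerically, here is a minimal base-R sketch. It assumes the usual biplot construction, in which the singular values of the centered data are split between observations and variables by an exponent; this is an illustration of that idea, not ggbiplot's actual source. The helper biplot_coords and the use of the built-in USArrests data are assumptions made for the example.

```r
# Sketch only: how a biplot 'scale' exponent can split the singular values
# between observations and variables. This is NOT ggbiplot's source code.
X <- scale(as.matrix(USArrests), center = TRUE, scale = TRUE)
s <- svd(X)  # X = U %*% diag(d) %*% t(V)

# Hypothetical helper: observation and variable coordinates for a given scale
biplot_coords <- function(s, scale) {
  list(
    obs = s$u %*% diag(s$d^(1 - scale)),  # observation markers
    var = s$v %*% diag(s$d^scale)         # variable arrows
  )
}

form_bp <- biplot_coords(s, scale = 0)  # form biplot
cov_bp  <- biplot_coords(s, scale = 1)  # covariance biplot

# Either split reproduces the data when all components are kept
ok_form <- all.equal(form_bp$obs %*% t(form_bp$var), X, check.attributes = FALSE)
ok_cov  <- all.equal(cov_bp$obs %*% t(cov_bp$var), X, check.attributes = FALSE)

# With scale = 0 the observation coordinates are the ordinary PC scores
# (compared in absolute value, since the sign of a component is arbitrary)
pca <- prcomp(USArrests, scale. = TRUE)
ok_scores <- all.equal(abs(form_bp$obs), abs(unname(pca$x)))

print(c(form = isTRUE(ok_form), cov = isTRUE(ok_cov), scores = isTRUE(ok_scores)))
```

In this construction the exponent only decides which side of the plot carries the singular values: with scale = 0 the points keep their PC-score distances, and with scale = 1 the arrows carry the variance information instead.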

I am not sure which of these scaling options I should choose for a prcomp object. I have seen lots of people set obs.scale = 1 and var.scale = 1, but I don't understand the reason for this.

What do the arrow length and angle actually correspond to in a biplot? My general understanding is that an arrow with a large coordinate on PC1 indicates that the variable has a strong influence on PC1, whilst a small coordinate indicates a small influence, and that if the arrow points to the right, the variable has a positive loading on that PC.
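
As a rough illustration of the covariance-biplot statement quoted above (inner products between variables approximate the covariance), the following base-R sketch checks it on the built-in USArrests data. It assumes variable arrows are built as the right singular vectors multiplied by the singular values and divided by sqrt(n - 1); that construction, the matrix name G and the choice of data are assumptions for the example, not something taken from ggbiplot.

```r
# Sketch only: what arrow length and angle mean in a covariance biplot,
# assuming arrows are V %*% diag(d) / sqrt(n - 1) for centered data.
X <- scale(as.matrix(USArrests), center = TRUE, scale = FALSE)
n <- nrow(X)
s <- svd(X)

G <- s$v %*% diag(s$d) / sqrt(n - 1)  # one row per variable, one column per PC
rownames(G) <- colnames(X)

# Using all components, inner products between arrows equal the covariances
ok_cov <- all.equal(G %*% t(G), cov(USArrests), check.attributes = FALSE)

# so arrow length equals the variable's standard deviation
len <- sqrt(rowSums(G^2))
ok_sd <- all.equal(len, apply(USArrests, 2, sd))

# and the cosine of the angle between two arrows equals their correlation
cosines <- (G %*% t(G)) / outer(len, len)
ok_cor <- all.equal(cosines, cor(USArrests), check.attributes = FALSE)

# A plotted biplot keeps only two components, so these become approximations:
G2 <- G[, 1:2]
len2 <- sqrt(rowSums(G2^2))  # never longer than the full-length arrows
print(round(cbind(sd = len, arrow_2d = len2), 2))

# The sign of column 1 says whether an arrow points left or right along PC1;
# the overall sign of a component is arbitrary, so only relative direction matters.
print(sign(G2[, 1]))
```

Under these assumptions, a long arrow means the first two components capture much of that variable's spread, and a small angle between two arrows suggests the variables are positively correlated. With other choices of var.scale the arrows are rescaled and these readings no longer hold exactly.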

Thanks for the help!!

Best wishes,

Lucy

biplot PCA ggbiplot • 14k views

Thank you for suggesting PCAtools - I will try this out.

Why do you set obs.scale and var.scale to 1 in the command above? If you change obs.scale, it seems to modify the PC1 axis, while changing var.scale modifies the loading arrows.


If you are unsure what roles these parameters play, then I would leave them at their defaults, or use some other PCA function that has actually passed review by a third party. ggbiplot is on neither CRAN nor Bioconductor, the main R package repositories, and is therefore simply some code posted to GitHub. In addition, as I mentioned, the project seems abandoned, with the last commit more than 4 years ago.

The actual lengths of those arrows mean nothing, as far as I am aware.

4.4 years ago

I think that this function is still in development, or was abandoned (last commit was in 2015). Implementation of the scale parameter has no apparent effect:

library(ggbiplot)
data(wine)  # example data shipped with ggbiplot; wine.class is used for grouping below
wine.pca <- prcomp(wine, scale. = TRUE)

# scale = 1, with obs.scale and var.scale set explicitly
g1 <- ggbiplot(wine.pca, scale = 1, obs.scale = 1, var.scale = 1,
  groups = wine.class, ellipse = TRUE, circle = TRUE) +
  scale_color_discrete(name = '') +
  theme(legend.direction = 'horizontal', legend.position = 'top')

# scale = 0, with the same explicit obs.scale and var.scale
g2 <- ggbiplot(wine.pca, scale = 0, obs.scale = 1, var.scale = 1,
  groups = wine.class, ellipse = TRUE, circle = TRUE) +
  scale_color_discrete(name = '') +
  theme(legend.direction = 'horizontal', legend.position = 'top')

# Draw both plots side by side for comparison
cowplot::plot_grid(g1, g2, ncol = 2)
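
One possible explanation, judging only from the signature quoted in the question: scale appears there as the default for obs.scale (1 - scale) and var.scale (scale). If that is its only role, then passing obs.scale and var.scale explicitly, as both calls above do, would override anything scale could change. The sketch below uses a hypothetical stand-in function, resolve_scales, that copies just those defaults; it is not ggbiplot itself and says nothing about the rest of its code.

```r
# Hypothetical stand-in that mirrors only the default arguments shown in the
# quoted ggbiplot signature; it is not part of ggbiplot.
resolve_scales <- function(scale = 1, obs.scale = 1 - scale, var.scale = scale) {
  c(obs.scale = obs.scale, var.scale = var.scale)
}

# Relying on the defaults: scale changes both factors
default_cov  <- resolve_scales(scale = 1)
default_form <- resolve_scales(scale = 0)

# Setting obs.scale and var.scale explicitly: scale no longer matters
explicit_1 <- resolve_scales(scale = 1, obs.scale = 1, var.scale = 1)
explicit_0 <- resolve_scales(scale = 0, obs.scale = 1, var.scale = 1)

print(rbind(default_cov, default_form, explicit_1, explicit_0))
print(identical(explicit_1, explicit_0))
```

If this reading is right, dropping the explicit obs.scale and var.scale arguments from the two calls would be the way to see a difference between scale = 1 and scale = 0.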


Without spending too much time trying to understand why the function applies these extra calculations, I would point you to PCAtools (by Aaron and me), which is on Bioconductor. With PCAtools, your input is just a data matrix, and it will produce the same output as prcomp().

Kevin


Ok thank you for the advice.
