Representing PCs as marginal distributions on the X and Y axes
6 weeks ago
Yogesh • 0

Hi

I came across the attached figure and would like to reproduce its plotting style, but I can't work out what is being plotted as the marginal distributions on the X and Y axes. I have a PCA plot and the PC loadings; how do I plot them as marginal distributions like this? Can someone explain? For clarity: the colours (blue, cyan, green and yellow) are different samples, and control and CS are the treatment groups.

[Attached figure not shown]

PCAtool Deseq2 Pcaplot PCA • 375 views
6 weeks ago
Papyrus ★ 2.9k

There are no "additional" units. In a PCA plot you draw a scatter plot of the PC scores (e.g. PC1 and PC2) on the X and Y axes, and the marginal plots are simply densities or histograms of those same X and Y values. You can try the ggMarginal function from the ggExtra package (see this page for examples) to produce a scatter plot with marginal distribution plots.
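
To make this concrete, here is a minimal base R sketch on simulated data. The matrix `mat`, its dimensions and the seed are assumptions for illustration only, not taken from the question. It computes the PC scores with `prcomp` and then the histogram and density of those scores, which is the kind of summary a marginal panel displays (PC1 for the top margin, PC2 for the right margin).

```r
# Simulated matrix: 50 features x 12 samples (illustrative only)
set.seed(1)
mat <- matrix(rnorm(600), nrow = 50, ncol = 12)

# PCA on samples; pca$x holds one row of PC scores per sample
pca <- prcomp(t(mat), center = TRUE, scale. = TRUE)
pc1 <- pca$x[, 1]
pc2 <- pca$x[, 2]

# The marginals are summaries of exactly these score vectors
h1 <- hist(pc1, plot = FALSE)  # histogram of the X values (top margin)
d1 <- density(pc1)             # kernel density of the X values (top margin)
d2 <- density(pc2)             # kernel density of the Y values (right margin)

# Every sample contributes one count to the histogram
sum(h1$counts) == ncol(mat)
```

Calling `plot(d1)` or `plot(h1)` draws the same shape that would sit above the scatter plot, so no units beyond the PC scores themselves are involved.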


Hi, thank you for the response. I tried it, but the histograms do not carry the metadata information such as group type and sample type. How do I add that?

Below is the code that I am using:

plotPCA <- function (object, intgroup=c("condition"), ntop = 500, returnData = FALSE) 
{
   .
.
.
.

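    # d is the data frame built in the elided code above; it is assumed to
    # contain the columns PC3, PC4 and group
    p <- ggplot(data = d, aes(x = PC3, y = PC4, color = group)) +
        geom_point(size = 3) +
        xlab(paste0("PC3: ", round(percentVar[3] * 100), "% variance")) +
        ylab(paste0("PC4: ", round(percentVar[4] * 100), "% variance")) +
        coord_fixed()

    # The argument is groupFill (capital F); the misspelt groupfill is not
    # matched, so the histograms were not split by group
    p_with_marginals <- ggMarginal(p, type = "histogram", groupFill = TRUE)

    # Return the plot; the print() call outside the function draws it once
    p_with_marginals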

}

print(plotPCA(vsd, intgroup=c("Group", "type")))

Here's some reproducible code to show the input data:

library(ggplot2)
library(ggExtra)

set.seed(1)  # fix the random seed so the simulated data are reproducible
df <- matrix(rnorm(600), nrow = 50, ncol = 12)
group1 <- rep(c("a", "b"), each = 6)
group2 <- rep(c("one", "two", "three"), each = 4)

pca <- prcomp(t(df), center = TRUE, scale. = TRUE)
varexp <- (pca$sdev^2 / sum(pca$sdev^2)) * 100

data <- data.frame(PC1 = pca$x[,1],
                   PC2 = pca$x[,2],
                   group1 = group1,
                   group2 = group2)

Then, an initial ggMarginal plot would be something like this:

plot1 <- ggplot(data, aes(y = PC2, x = PC1, color = group1, shape = group2)) +
  geom_point() + xlab(paste0("Dim1 (var: ",round(varexp[1],1),"%)")) +
  ylab(paste0("Dim2 (var: ",round(varexp[2],1),"%)")) + theme_bw()
plot.m1 <- ggMarginal(plot1, type = "density", groupFill = TRUE, groupColour = TRUE)
plot.m1
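
As a rough illustration of what the grouped marginals represent, the sketch below splits the PC1 scores by group and estimates one density per group in base R. It rebuilds its own simulated data (the names `mat`, `pc1_by_group` and `dens_by_group` are hypothetical), and it is only conceptually equivalent to the ggMarginal output, not a reproduction of its internals.

```r
# Simulated data, same shape as the example above (illustrative only)
set.seed(1)
mat <- matrix(rnorm(600), nrow = 50, ncol = 12)
group1 <- rep(c("a", "b"), each = 6)

pca <- prcomp(t(mat), center = TRUE, scale. = TRUE)

# Split the PC1 scores by group, then estimate one density per group
pc1_by_group <- split(pca$x[, 1], group1)
dens_by_group <- lapply(pc1_by_group, density)

# One curve per level of group1
names(dens_by_group)
```

Each element of `dens_by_group` can be drawn with `plot()` or `lines()`; overlaying them gives the same kind of per-group curves that appear in the top marginal when the scatter plot is coloured by `group1`.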

If you want a different grouping for the marginal distribution on each axis, you can use an approach like the one proposed here (there may be other solutions):

plot1 <- ggplot(data, aes(y = PC2, x = PC1, color = group1, shape = group2)) +
  geom_point() + xlab(paste0("Dim1 (var: ",round(varexp[1],1),"%)")) +
  ylab(paste0("Dim2 (var: ",round(varexp[2],1),"%)")) + theme_bw()
plot.m1 <- ggMarginal(plot1, type = "density", groupFill = TRUE, groupColour = TRUE)

plot2 <- ggplot(data, aes(y = PC2, x = PC1, color = group2, shape = group2)) +
  geom_point() + xlab(paste0("Dim1 (var: ",round(varexp[1],1),"%)")) +
  ylab(paste0("Dim2 (var: ",round(varexp[2],1),"%)")) + theme_bw()
plot.m2 <- ggMarginal(plot2, type = "density", groupFill = TRUE, groupColour = TRUE)


plot.m1$grobs[plot.m1$layout$name == "topMargPlot"] <- 
  plot.m2$grobs[plot.m2$layout$name == "topMargPlot"]

plot.m1