Correlation plot sized by another variable
14 months ago
Lucy ▴ 80

Hi,

Sorry for the broad question, but I am trying to generate a correlation plot that is coloured by the correlation values but where the tiles (preferably circles) are sized by another variable. I would like the correlation matrix to be ordered by hclust.

I have been playing around with corrplot, ggcorrplot and ggplot2, but I have struggled to find a way to generate this plot.

Any help would be appreciated.

Best,

Lucy

ggplot2 correlation plot corrplot • 486 views

Hi,

António

ggcorrplot(as.matrix(cormat), type = "upper",
           ggtheme = ggplot2::theme_classic(), hc.order = TRUE) +
  scale_fill_gradient(breaks = c(0, 1), limit = c(0, 1))


This works well but unfortunately I can't work out how to size the squares by another matrix.

Same with the following:

corrplot(as.matrix(cormat), type = "upper", order = "hclust",
         tl.col = "black", tl.srt = 90, cl.lim = c(0, 1), is.corr = FALSE,
         col = col2(200))


On the other hand, I have converted the matrix into long format and run the following:

# geom_point() ignores fill for its default shape, so correlation is mapped to colour
ggplot(cluster_markers, aes(x = base_cluster, y = comparator_cluster, colour = correlation, size = number_of_markers)) +
  geom_point()


But I can't work out how to order the above using hclust, as now my data is in long format.


If you have the order from hclust, you can use that to turn the cluster names into a factor using that order as the levels.


Thank you - I tried this but it didn't turn out right! I separately calculated a distance matrix from cormat and performed hclust on it. I then took the row order and used it to order the base_cluster factor in cluster_markers, but the points were no longer restricted to the upper triangle of my plot.


For your ggplot command what code are you using to set the factor order?


Before running ggplot, I did:

cluster_markers$base_cluster <- factor(cluster_markers$base_cluster, levels = hclust_order, ordered = TRUE)


Where hclust_order is the cluster names in the order from hclust.
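
One possible reason the points leave the upper triangle is that only one axis was reordered, and that pairs stored as an upper triangle in the original order need not form one in the hclust order. The sketch below is an illustration under assumptions, not the poster's actual data: cormat and cluster_markers are toy stand-ins built from the longley dataset, and number_of_markers is simulated. It swaps each pair where needed so that the x position never exceeds the y position in the new order, then gives both axes the same factor levels.

```r
library(ggplot2)

# Toy stand-ins for the thread's objects (assumed, not the poster's data)
set.seed(1)
cormat <- abs(cor(longley))   # symmetric matrix with values in [0, 1]
hclust_order <- rownames(cormat)[hclust(dist(cormat))$order]

# Long format holding each unordered pair once, in the ORIGINAL matrix order
idx <- which(upper.tri(cormat, diag = TRUE), arr.ind = TRUE)
cluster_markers <- data.frame(
  base_cluster       = rownames(cormat)[idx[, "row"]],
  comparator_cluster = colnames(cormat)[idx[, "col"]],
  correlation        = cormat[idx],
  number_of_markers  = sample(5:50, nrow(idx), replace = TRUE),  # simulated
  stringsAsFactors   = FALSE
)

# Position of each cluster name in the hclust order
pos_base <- match(cluster_markers$base_cluster, hclust_order)
pos_comp <- match(cluster_markers$comparator_cluster, hclust_order)

# Swap the two names wherever x would land after y in the NEW order
swap <- pos_base > pos_comp
tmp <- cluster_markers$base_cluster[swap]
cluster_markers$base_cluster[swap] <- cluster_markers$comparator_cluster[swap]
cluster_markers$comparator_cluster[swap] <- tmp

# Use the same levels on both axes
cluster_markers$base_cluster <- factor(cluster_markers$base_cluster,
                                       levels = hclust_order)
cluster_markers$comparator_cluster <- factor(cluster_markers$comparator_cluster,
                                             levels = hclust_order)

p <- ggplot(cluster_markers,
            aes(x = base_cluster, y = comparator_cluster,
                colour = correlation, size = number_of_markers)) +
  geom_point() +
  scale_size_area()
p
```

The swap is only valid because a correlation matrix is symmetric, so (a, b) and (b, a) carry the same value.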


All of this should be possible via corrplot(). I implement it here: https://github.com/kevinblighe/scDataviz/blob/master/R/plotSignatures.R


Thank you, which argument is it that you use to size by another matrix?


Oh, by 'another' matrix? - for that, proceed with the ggplot2 approach. lattice::levelplot could also do it.

14 months ago

This is a solution with base ggplot2:

library(reshape2)
library(ggplot2)

data_wide <- cor(longley)
# column names in hierarchical-clustering order
myorder <- colnames(data_wide)[hclust(dist(data_wide))$order]

# prepare data to plot
data_long <- melt(data_wide, value.name = "correlation")
data_long$Var1 <- factor(data_long$Var1, levels = myorder)
data_long$Var2 <- factor(data_long$Var2, levels = myorder)

# just a simulated additional variable to scale by (one non-negative value per row)
data_long$sizedby <- abs(rnorm(nrow(data_long)))

myplot <- ggplot(data_long, aes(x = Var1, y = Var2, color = correlation, size = sizedby)) +
  geom_point() +
  scale_size_area()
myplot


I'm not too familiar with corrplot, but it can at least take a matrix of p-values for the respective correlations via the p.mat parameter; sig.level, insig and pch then control how non-significant correlations are drawn. That may help if you asked the question because you wish to display significant correlations only.
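
A minimal sketch of the p.mat route, assuming the longley dataset as example input and a 0.05 cut-off (both are illustrative choices, not from the thread). cor.mtest() is a helper shipped with corrplot that returns a matrix of p-values in its $p element; insig = "blank" leaves non-significant cells empty.

```r
library(corrplot)

cormat <- cor(longley)

# p-values for every pairwise correlation test
pmat <- cor.mtest(longley, conf.level = 0.95)$p

# Upper triangle, hclust ordering, non-significant cells left blank
corrplot(cormat, type = "upper", order = "hclust",
         p.mat = pmat, sig.level = 0.05, insig = "blank",
         tl.col = "black", tl.srt = 90)
```

Note that this hides or marks cells by significance; it does not size them by a second matrix.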

I hope this is what you were looking for.

Thias


Thank you Thias, the ggplot approach works really well. One question: what would be the easiest way to visualise just the upper triangle of the plot? As these are correlations, the upper and lower triangles provide identical information.
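
One possible way to do this, offered as a sketch rather than an answer taken from the thread: reorder the matrix by the hclust order first, blank one triangle, and drop the blanked cells while melting. It reuses the longley example and the simulated sizedby variable from the answer above; data_ord is a name introduced here for illustration.

```r
library(reshape2)
library(ggplot2)

data_wide <- cor(longley)
myorder <- colnames(data_wide)[hclust(dist(data_wide))$order]

# Reorder rows and columns first, so "triangle" refers to the plotted order
data_ord <- data_wide[myorder, myorder]

# Blank one triangle (the diagonal is kept), then drop NA cells while melting
data_ord[lower.tri(data_ord)] <- NA
data_long <- melt(data_ord, value.name = "correlation", na.rm = TRUE)

data_long$Var1 <- factor(data_long$Var1, levels = myorder)
data_long$Var2 <- factor(data_long$Var2, levels = myorder)

# Simulated variable to size by (assumption, as in the answer above)
set.seed(1)
data_long$sizedby <- runif(nrow(data_long))

myplot <- ggplot(data_long,
                 aes(x = Var1, y = Var2, color = correlation, size = sizedby)) +
  geom_point() +
  scale_size_area()
myplot
```

Blanking the triangle before reordering would scatter the remaining cells across both halves of the plot, which is why the reordering comes first.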