Question

Control column order in pheatmap (cluster_cols)

3.4 years ago

lgspeight ▴ 10

I have two column annotations, salinity and population. When I use cluster_cols = T, my pheatmap is hard to interpret because it is difficult to distinguish salinity and population. When I use cluster_cols = F, the columns come out grouped by population first, then salinity. I want the columns grouped by salinity first, and by population within each salinity group. How can I do this? I have already tried rearranging the order of population and salinity.

df <- data.frame(colData(dds)[,c("salinity","population")])
rownames(df) <- colnames(dds)
colnames(df) <- c("salinity","population")

pheatmap(
  assay(vsd)[lfcorder[1:10],], 
  cluster_rows=F, 
  show_rownames=T,
  cluster_cols=T,
  annotation_col=df
)

pheatmap(
  assay(vsd)[lfcorder[1:10],], 
  cluster_rows=F, 
  show_rownames=T,
  cluster_cols=F,
  annotation_col=df
)

Ram · Answer 1 · 2022-06-14

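Have you tried rearranging the columns of assay(vsd) so they're in the desired order? With cluster_cols = F, pheatmap draws the columns in the order they appear in the matrix. You'll need to sort the rows of df, take its rownames, and use them to index the columns of assay(vsd), like so: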

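# Sort the annotation rows into the desired column order.
# Note the trailing comma: it indexes rows, not columns.
df <- df[order(....), ]
pheatmap(
  # Index columns by the sorted sample names so the matrix matches df
  assay(vsd)[lfcorder[1:10], rownames(df)],
  cluster_rows = FALSE,
  show_rownames = TRUE,
  cluster_cols = FALSE,
  annotation_col = df
)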

EDIT: You can also sort with dplyr's arrange() instead of base R's order(), as you mention in your comment below.
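To make the sorting step concrete, here is a minimal sketch of ordering by salinity first and population within salinity. The objects `mat`, `samples`, `df_sorted`, and `mat_sorted` are hypothetical toy stand-ins, not the asker's data: `mat` plays the role of assay(vsd)[lfcorder[1:10], ] and `df` the role of the annotation data frame. The pheatmap call is only run if the package is installed.

```r
# Toy stand-ins for the objects in the question (hypothetical data).
set.seed(1)
samples <- paste0("sample", 1:8)
mat <- matrix(rnorm(10 * 8), nrow = 10,
              dimnames = list(paste0("gene", 1:10), samples))
df <- data.frame(
  salinity   = rep(c("high", "low"), times = 4),
  population = rep(c("popB", "popA"), each = 4),
  row.names  = samples,
  stringsAsFactors = FALSE
)

# Sort annotation rows by salinity, then by population within salinity.
# The trailing comma selects rows; drop = FALSE keeps a data frame.
df_sorted <- df[order(df$salinity, df$population), , drop = FALSE]

# Reorder the matrix columns to match the sorted annotation.
mat_sorted <- mat[, rownames(df_sorted), drop = FALSE]

# Draw the heatmap without column clustering so the order is kept.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(mat_sorted,
                     cluster_rows = FALSE,
                     cluster_cols = FALSE,
                     show_rownames = TRUE,
                     annotation_col = df_sorted)
}
```

With the real objects, the same idea would be to sort df this way and pass rownames(df) as the column index of assay(vsd), as in the call above.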

Thanks Ram!

I ended up using

df <- df %>% arrange(desc(salinity))

I think this is the same idea you were talking about.

Reply by lgspeight, 3.4 years ago
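Note that the arrange() call above sorts by salinity only, so the order of populations within each salinity level is whatever it happened to be before. As a rough base R sketch of a descending salinity sort with population nested inside it, one option is order() with method = "radix", which accepts one decreasing flag per sort key. The data frame below is hypothetical toy data, and `idx` and `df_sorted` are illustrative names.

```r
# Hypothetical annotation data frame with sample names as row names.
df <- data.frame(
  salinity   = c("low", "high", "low", "high", "low", "high"),
  population = c("popB", "popA", "popA", "popB", "popA", "popB"),
  row.names  = paste0("sample", 1:6),
  stringsAsFactors = FALSE
)

# Salinity descending, population ascending within each salinity level.
# A per-key `decreasing` vector requires method = "radix".
idx <- order(df$salinity, df$population,
             decreasing = c(TRUE, FALSE), method = "radix")
df_sorted <- df[idx, , drop = FALSE]

print(df_sorted)
```

The row names of df_sorted can then be used to index the columns of the expression matrix, as in the answer.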

Yep - dplyr vs base R for sorting is the difference between our approaches. Glad things worked out!

Given it was the same concept, can you please accept my answer to mark the post resolved? I'll edit it and add a statement about how dplyr can be used as well.

Reply by Ram, 3.4 years ago