Question: heat map clustering
Nick wrote 21 months ago:

Hello everyone,

During RNA-seq analysis I ran into a problem. I need to produce a hierarchically clustered heatmap, and I got the following result using the following code:

library(gplots)        # provides heatmap.2()
library(RColorBrewer)  # provides brewer.pal()

# i (number of genes), stri (plot title) and colors (one color per sample) are defined elsewhere
var_genes <- apply(logCounts, 1, var)
select_var <- names(sort(var_genes, decreasing = TRUE))[1:i]
highly_variable_lcpm <<- logCounts[select_var,]
par(mfrow=c(1,2), mar=c(5,4,10,2))

mypalette <- brewer.pal(11,"RdYlBu")
morecols <- colorRampPalette(mypalette)

heatmap.2(highly_variable_lcpm, col=rev(morecols(50)), offsetRow=0, offsetCol = -0.2, cexCol = 0.6, trace="none", main=stri, ColSideColors=colors, scale="row")

[Image: the resulting heatmap]

So the question is: how can I change my plot to cluster my genes (there should be three green and three purple in a row together)?

P.S. I tried using fewer samples, but it didn't help. Thanks.

Tags: heatmap, rna-seq, R

Your genes are already clustered, as indicated by the dendrogram on the left. However, there does not appear to be any discernible pattern of expression. To reveal more patterns of expression, I would actually include more genes in the heatmap.
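As a rough sketch of what "include more genes" could look like in practice, the variance-based selection from the question can be wrapped in a small helper so that the number of genes is easy to raise. The helper name `top_variable` and the simulated `logCounts` matrix are made up for illustration and are not part of the original post.

```r
# Simulated log-CPM matrix: 200 genes x 6 samples (made-up data).
set.seed(1)
logCounts <- matrix(rnorm(200 * 6), nrow = 200,
                    dimnames = list(paste0("gene", 1:200),
                                    paste0("sample", 1:6)))

# Hypothetical helper: keep the n rows (genes) with the highest variance.
top_variable <- function(mat, n) {
  v <- apply(mat, 1, var)
  n <- min(n, nrow(mat))  # never ask for more genes than exist
  keep <- names(sort(v, decreasing = TRUE))[seq_len(n)]
  mat[keep, , drop = FALSE]
}

small <- top_variable(logCounts, 20)   # a narrow selection
large <- top_variable(logCounts, 100)  # a broader selection to plot instead
dim(small)
dim(large)
```

Either matrix can then be passed to heatmap.2() in place of highly_variable_lcpm to compare how the patterns change with the number of genes.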

Other things that you can try:

  • use different distance metrics
  • use different agglomeration functions
  • use different linkage metrics
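The options above can be sketched roughly as follows, using only base R on simulated data. The function names (`cor_dist`, `ward_link`, and so on) are illustrative choices, and the specific metric and linkage shown are just examples, not a recommendation for this dataset.

```r
# Simulated expression matrix: 50 genes x 6 samples (made-up data).
set.seed(1)
mat <- matrix(rnorm(50 * 6), nrow = 50,
              dimnames = list(paste0("gene", 1:50),
                              paste0("sample", 1:6)))

# Two candidate distance functions (each works on the rows of x).
euclid_dist <- function(x) dist(x, method = "euclidean")
cor_dist    <- function(x) as.dist(1 - cor(t(x)))  # 1 - Pearson correlation

# Two candidate agglomeration (linkage) functions.
ward_link <- function(d) hclust(d, method = "ward.D2")
avg_link  <- function(d) hclust(d, method = "average")

# Cluster genes (rows) and samples (columns) with a chosen combination.
row_tree <- ward_link(cor_dist(mat))
col_tree <- avg_link(euclid_dist(t(mat)))

# The same functions can be handed to heatmap.2 through its distfun and
# hclustfun arguments (requires the gplots package):
# library(gplots)
# heatmap.2(mat, distfun = cor_dist, hclustfun = ward_link,
#           trace = "none", scale = "row")
```

heatmap.2 applies distfun to the matrix for the rows and to its transpose for the columns, so a single function covers both dendrograms.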

I go over some of these, here: A: How to cluster the upregulated and downregulated genes in heatmap?

Also, what is the source of your data? Is it just the genes that have high variance among your 6 samples? Is there a particular reason for showing these genes and not those that are statistically significantly differentially expressed?

Written 21 months ago by Kevin Blighe

ahmad mousavi (Royan Institute, Tehran, Iran) wrote 21 months ago:

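For a log2 transform of the count matrix, try this:

      # exp is the raw count matrix; greenred() comes from the gplots package
      pheatmap(log2(exp + 1), show_rownames = FALSE)
      pheatmap(log2(exp + 1), show_rownames = FALSE, color = greenred(75))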

You can use the following code for a heatmap of the top 40 genes (lowest adjusted p-value) from a DESeq2 result:

  #Heatmap 40 top genes
      rld = rlogTransformation(cds)
      mat = assay(rld)[ head(order(res$padj),40), ] # select the top 40 genes with the lowest padj
      mat = mat - rowMeans(mat) # Subtract the row means from each value
      # Optional, but to make the plot nicer:
      df = as.data.frame(colData(rld)[,c("condition")]) # Create a dataframe with a column of the conditions
      colnames(df) = "Condition" # Rename the column header
      rownames(df) = colnames(mat) # add rownames
      # and plot the actual heatmap

      fnh40 <- paste("Heatmap_Top40_",fn,sep="")

      pheatmap(mat, annotation_col=df)
      pheatmap(mat,annotation_col=df, color=greenred(75))
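
To make the row-centering and annotation steps above easier to follow, here is a minimal sketch on simulated data that uses only base R. The matrix, the sample names and the two condition labels are all made up; the pheatmap call is left commented out because it needs the pheatmap package.

```r
# Simulated transformed expression values: 8 genes x 6 samples (made-up data).
set.seed(1)
mat <- matrix(rnorm(8 * 6, mean = 8), nrow = 8,
              dimnames = list(paste0("gene", 1:8), paste0("s", 1:6)))

# Assumed design: three control and three treated samples.
condition <- factor(rep(c("control", "treated"), each = 3))

# Subtract each gene's mean so the colors show deviation from that mean.
centered <- mat - rowMeans(mat)
round(rowMeans(centered), 10)  # every row mean is now 0

# Annotation data frame: one row per sample, row names matching the
# column names of the matrix, which is how pheatmap pairs them up.
df <- data.frame(Condition = condition, row.names = colnames(centered))
df

# library(pheatmap)
# pheatmap(centered, annotation_col = df)
```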