Question: heat map clustering
12 months ago by
Nick0 wrote:

Hello everyone,

During an RNA-seq analysis I ran into a problem. I need to produce a hierarchically clustered heatmap, and I got the result below using the following code:

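library(gplots)        # heatmap.2
library(RColorBrewer)  # brewer.pal

# Rank genes by variance across samples and keep the i most variable
# (i, stri and colors are defined elsewhere, not shown)
var_genes <- apply(logCounts, 1, var)
select_var <- names(sort(var_genes, decreasing = TRUE))[1:i]
highly_variable_lcpm <<- logCounts[select_var, ]
par(mfrow = c(1, 2), mar = c(5, 4, 10, 2))

# Red-yellow-blue palette, interpolated to 50 colours and reversed
mypalette <- brewer.pal(11, "RdYlBu")
morecols <- colorRampPalette(mypalette)

heatmap.2(highly_variable_lcpm, col = rev(morecols(50)), offsetRow = 0, offsetCol = -0.2,
          cexCol = 0.6, trace = "none", main = stri, ColSideColors = colors, scale = "row")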

[Figure: the resulting heatmap]

So my question is: how can I change the plot so that my genes cluster (there should be three green and three purple in a row, together)?

P.S. I tried using fewer samples, but it didn't help. Thanks.

heatmap rna-seq R • 458 views

Your genes are already clustered, as indicated by the dendrogram, at left. However, there does not appear to be any discernible pattern of expression. To reveal more patterns of expression, I would actually include more genes in the heatmap.

Other things that you can try:

  • use different distance metrics
  • use different agglomeration functions
  • use different linkage metrics

I go over some of these, here: A: How to cluster the upregulated and downregulated genes in heatmap?
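
The suggestions above can be sketched roughly as follows. This is only an illustration on simulated data, not the asker's matrix: `mat`, `cor_dist` and `ward_clust` are hypothetical names, and the choice of 1 minus Pearson correlation with Ward linkage is just one of many possible combinations. `heatmap.2` accepts custom functions through its `distfun` and `hclustfun` arguments.

```r
# Simulated stand-in for a log-CPM matrix: 100 genes x 6 samples (hypothetical data)
set.seed(1)
mat <- matrix(rnorm(600), nrow = 100,
              dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))
# Raise the first 30 genes in samples 4-6 so that there is a group signal to find
mat[1:30, 4:6] <- mat[1:30, 4:6] + 3

# Distance between the rows of x: 1 - Pearson correlation
cor_dist <- function(x) as.dist(1 - cor(t(x)))
# Agglomeration (linkage) method: Ward instead of the default complete linkage
ward_clust <- function(d) hclust(d, method = "ward.D2")

# Cluster the samples (columns) directly to inspect the grouping
sample_tree <- ward_clust(cor_dist(t(mat)))
sample_groups <- cutree(sample_tree, k = 2)
print(sample_groups)

# Same distance and linkage passed to heatmap.2 (only if gplots is installed)
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(mat, distfun = cor_dist, hclustfun = ward_clust,
                    scale = "row", trace = "none")
}
```

Swapping `method = "ward.D2"` for `"average"` or `"complete"`, or replacing `cor_dist` with plain `dist`, is a quick way to see how sensitive the dendrogram is to these choices.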

Also, what is the source of your data? Just genes that have high variance among your 6 samples? Is there a particular reason for showing these genes and not those that are statistically significantly differentially expressed?

written 12 months ago by Kevin Blighe
12 months ago by
ahmad mousavi450
Royan Institute, Tehran, Iran
ahmad mousavi450 wrote:


To plot a log2-transformed count matrix, try this:

      library(pheatmap)
      library(gplots)  # provides greenred()
      pheatmap(log2(exp + 1), show_rownames = FALSE)
      pheatmap(log2(exp + 1), show_rownames = FALSE, color = greenred(75))
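
As a possible variation on the calls above: on log2 counts alone, the clustering is often driven by overall expression level, so row scaling and a correlation-based distance may be worth trying. The sketch below uses simulated counts; `counts`, `log_counts` and `z` are hypothetical names, and the specific distance and linkage are assumptions rather than a recommendation from this thread.

```r
# Simulated count matrix: 50 genes x 6 samples (hypothetical data)
set.seed(42)
counts <- matrix(rnbinom(300, mu = 100, size = 5), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("sample", 1:6)))
log_counts <- log2(counts + 1)

# Drop genes with zero variance, which cannot be row-scaled
log_counts <- log_counts[apply(log_counts, 1, sd) > 0, , drop = FALSE]

# Row z-scores computed by hand, equivalent in spirit to scale = "row"
z <- t(scale(t(log_counts)))

# Row-scaled heatmap with correlation distance (only if pheatmap is installed)
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(log_counts, scale = "row",
                     clustering_distance_rows = "correlation",
                     clustering_method = "average",
                     show_rownames = FALSE)
}
```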

You can use the following code to draw a heatmap of the top 40 genes based on the DESeq2 results:

  # Heatmap of the top 40 genes
      rld = rlogTransformation(cds)
      mat = assay(rld)[ head(order(res$padj),40), ] # select the top 40 genes with the lowest padj
      mat = mat - rowMeans(mat) # Subtract the row means from each value
      # Optional, but to make the plot nicer:
      df = as.data.frame(colData(rld)[,c("condition")]) # Create a dataframe with a column of the conditions
      colnames(df) = "Condition" # Rename the column header
      rownames(df) = colnames(mat) # add rownames
      # and plot the actual heatmap

      fnh40 <- paste("Heatmap_Top40_",fn,sep="")

      pheatmap(mat, annotation_col=df)
      pheatmap(mat,annotation_col=df, color=greenred(75))
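
For anyone who wants to try the same steps without a DESeq2 object at hand, here is a minimal self-contained sketch on simulated data. Everything in it is made up for illustration: `expr` stands in for `assay(rld)`, `padj` for `res$padj`, and `condition` for the sample grouping.

```r
# Simulated stand-ins (hypothetical data): 200 genes x 6 samples
set.seed(7)
expr <- matrix(rnorm(200 * 6, mean = 8), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6)))
padj <- runif(200)                      # made-up adjusted p-values
names(padj) <- rownames(expr)
condition <- factor(rep(c("control", "treated"), each = 3))

# Keep the 40 genes with the lowest adjusted p-value
mat <- expr[head(order(padj), 40), ]
# Centre each gene on its own mean so colours show deviation from average
mat <- mat - rowMeans(mat)

# Column annotation: one row per sample, row names matching the matrix columns
df <- data.frame(Condition = condition, row.names = colnames(mat))

# Draw the annotated heatmap (only if pheatmap is installed)
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(mat, annotation_col = df)
}
```

The annotation bar makes it easy to see at a glance whether the column dendrogram separates the two conditions.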