
I have RNA-seq data (FPKMs) from Cufflinks and would like to cluster it by gene and produce a heatmap.

This is my first try at using R, and I have spent a LOT of time poring over the manual/help pages and internet tutorials on how to do this.

I can now produce heatmaps using `heatmap` easily enough. My problem is that I can produce them from many different versions/transformations of my data, and I cannot figure out what each version is actually doing or which heatmap corresponds to the analysis I am interested in.

What I am trying to get is a) gene names clustered by expression profile, to mine for enriched gene groups/pathways; and b) a heatmap of FPKM values, with the same gene clustering.

This is the R code. Data input/preparation:

```
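# Read the tab-delimited FPKM table; read.table() already returns a data frame
m <- read.table("DMSTSC1000_notmeanctrd.txt", header = TRUE, sep = "\t")
row.names(m) <- m$test_id   # use the gene IDs as row names
m <- m[, 2:7]               # keep columns 2 to 7 (the expression values)
m_matrix <- data.matrix(m)  # heatmap() needs a numeric matrix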
```

Making Heatmap version 1:

```
heatmap(m_matrix, Colv=NA, scale="column")
```

Making Heatmap version 2. This came about because a paper described using a Pearson correlation metric for clustering, but this heatmap looks terrible: the clustering appears to bear little relationship to the plotted data.

```
cor_t <- cor(t(m_matrix))
distancet <- as.dist(cor_t)
hclust_complete <- hclust(distancet, method = "complete")
dendcomplete <- as.dendrogram(hclust_complete)
heatmap(m_matrix, Rowv=dendcomplete, Colv=NA, scale="column")
```
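
A possible explanation for version 2, sketched on made-up data: `as.dist()` takes whatever values it is given as distances, so passing the correlation matrix directly makes highly correlated genes (r near 1) look far apart and anti-correlated genes look close. A common convention is to cluster on `1 - r` instead. The `toy` matrix and its gene/sample names below are illustrative stand-ins, not the real FPKM table, and this is only one of several ways to turn a correlation into a distance.

```r
# Toy stand-in for the FPKM matrix: 6 genes x 4 samples (values are made up)
set.seed(1)
toy <- matrix(rnorm(24), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))
toy["gene2", ] <- toy["gene1", ] * 2 + 3   # same profile as gene1, different level
toy["gene3", ] <- -toy["gene1", ]          # mirror image of gene1

gene_cor <- cor(t(toy))                # Pearson correlation between genes (rows)
cor_dist <- as.dist(1 - gene_cor)      # 0 = perfectly correlated, 2 = anti-correlated
hc_cor   <- hclust(cor_dist, method = "complete")
dend_cor <- as.dendrogram(hc_cor)

# Same call as version 2, but with the correlation-based tree
heatmap(toy, Rowv = dend_cor, Colv = NA, scale = "column")
```

With this distance, `gene1` and `gene2` end up next to each other even though their absolute levels differ, which is the behaviour one usually wants from a correlation metric.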

Making Heatmap version 3:

```
distancem <- dist(m_matrix)
hclust_completem <- hclust(distancem, method = "complete")
dendcompletem <- as.dendrogram(hclust_completem)
heatmap(m_matrix, Rowv=dendcompletem, Colv=NA, scale="column")
```
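
For goals a) and b), the `hclust` object already carries both pieces: the gene order used by the heatmap and, via `cutree()`, a cluster label per gene. The sketch below uses a made-up `toy` matrix and an arbitrary `k = 3`; both are assumptions for illustration only. As far as I can tell from the `heatmap()` source, the default call (version 1) also uses `dist()` and complete-linkage `hclust()` on the unscaled matrix and applies `scale` only afterwards for the colors, so versions 1 and 3 should differ only in how the leaves are ordered.

```r
# Toy stand-in for the FPKM matrix: 10 genes x 4 samples (values are made up)
set.seed(42)
toy <- matrix(rnorm(40), nrow = 10,
              dimnames = list(paste0("gene", 1:10), paste0("s", 1:4)))

hc   <- hclust(dist(toy), method = "complete")
dend <- as.dendrogram(hc)

# (a) gene names in dendrogram order, and genes grouped by cluster
genes_in_order   <- hc$labels[hc$order]
clusters         <- cutree(hc, k = 3)          # k = 3 is an arbitrary choice
genes_by_cluster <- split(names(clusters), clusters)

# (b) heatmap drawn with exactly the same tree
hm <- heatmap(toy, Rowv = dend, Colv = NA, scale = "column")
```

Each element of `genes_by_cluster` is a character vector of gene names that could be passed to an enrichment tool, and `hm$rowInd` gives the row order that was actually plotted.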

Or, if you have code for a fourth way that you're confident about, I'd love to hear it! I tried to use pam but haven't been able to produce a heatmap from it yet.
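
One way a `pam` result might be turned into a heatmap is to sort the rows by cluster label and switch off the row dendrogram. This is only a sketch: it assumes the `cluster` package is installed, uses a made-up `toy` matrix, and picks `k = 3` and the side-bar colors arbitrarily.

```r
library(cluster)  # provides pam()

# Toy stand-in for the FPKM matrix: 12 genes x 4 samples (values are made up)
set.seed(7)
toy <- matrix(rnorm(48), nrow = 12,
              dimnames = list(paste0("gene", 1:12), paste0("s", 1:4)))

pam_fit   <- pam(toy, k = 3)              # k = 3 is an arbitrary choice
row_order <- order(pam_fit$clustering)    # rows grouped by cluster label
side_cols <- c("red", "blue", "darkgreen")[pam_fit$clustering[row_order]]

# No row dendrogram (Rowv = NA); the colored side bar marks the PAM clusters
heatmap(toy[row_order, ], Rowv = NA, Colv = NA, scale = "column",
        RowSideColors = side_cols)
```

Since PAM produces a flat partition rather than a tree, there is no dendrogram to draw; the order of genes within each cluster here is simply their original order.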

Sorry about not uploading images; I haven't figured out how to web-host them yet.

Details: The FPKM data have been log2-transformed, and high outliers were capped at a maximum value (10) to increase the range of colors used for the majority of the data.
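
For readers who want to reproduce that preprocessing, a minimal sketch follows. The `fpkm` matrix is made up, and the pseudocount of 1 (to avoid `log2(0)`) is an assumption; the post does not say how zeros were handled.

```r
# Made-up raw FPKM values: 3 genes x 2 samples
fpkm <- matrix(c(0, 5, 120, 3000, 0.5, 40000), nrow = 3,
               dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

log_fpkm <- log2(fpkm + 1)        # pseudocount of 1 is an assumption
capped   <- pmin(log_fpkm, 10)    # cap high outliers at 10, as described above
```

`pmin()` keeps the dimensions and row/column names of its first argument, so `capped` can be passed straight to `heatmap()`.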

Thank you in advance for your help, it is very much appreciated!!

**320**• written 7.0 years ago by Kanne •


Maybe you should change the title, since, from what I understood, your problem is more about choosing a clustering method than about generating and analyzing heatmaps, which you seem to know how to do.

True, will do, thanks!
