Hello!! I work in breast cancer and I have RNA-seq data from 60 patients. I classified the samples into intrinsic subtypes by immunohistochemistry and I have been using DESeq2 to perform differential expression analysis between subtypes. I would like to do a global analysis: I would like to see which samples cluster together according to gene expression, something unsupervised but using the whole transcriptome. I already used PAM50 to assign samples to intrinsic subtypes according to gene expression, but I would like to see which samples cluster together and what characterizes each cluster, and from there do the differential expression analysis. What tools can I use?
Have you seen the clustering examples in the DESeq2 vignette or workflow?
For 60 samples, I would apply the varianceStabilizingTransformation (see vignette), then explore with the plots we show in the vignette. You can use built-in R tools to cluster the samples, for example:
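vsd <- varianceStabilizingTransformation(dds)  # dds is your DESeqDataSet
mat <- assay(vsd)
hc <- hclust(dist(t(mat)))  # t() so that samples, not genes, are clustered
plot(hc)
cutree(hc, k=3)  # k=3 is only an example; pick k from the dendrogram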
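A possible variation on the snippet above, as a rough sketch rather than a recommendation: cluster on only the most variable genes, then cross-tabulate the resulting clusters against the subtype labels you already have. The matrix here is simulated so the code runs on its own; in practice `mat` would be `assay(vsd)`. The `subtype` vector, the cutoff of 100 genes, and `k = 2` are all assumptions made for illustration.

```r
# Simulated stand-in for assay(vsd): 500 genes x 12 samples (hypothetical data)
set.seed(1)
mat <- matrix(rnorm(500 * 12), nrow = 500,
              dimnames = list(paste0("gene", 1:500), paste0("s", 1:12)))

# Hypothetical subtype labels, e.g. from IHC or PAM50
subtype <- factor(rep(c("LumA", "LumB", "Basal"), each = 4))

# Shift 50 genes in the Basal samples so there is some structure to find
mat[1:50, subtype == "Basal"] <- mat[1:50, subtype == "Basal"] + 3

# Keep the most variable genes (100 is an arbitrary choice)
rv <- apply(mat, 1, var)
top <- order(rv, decreasing = TRUE)[1:100]

# Cluster samples on those genes and cut the tree
hc <- hclust(dist(t(mat[top, ])))
cl <- cutree(hc, k = 2)

# Cross-tabulate clusters against the known labels
tab <- table(cluster = cl, subtype = subtype)
print(tab)
```

The table shows how far the unsupervised clusters agree with the labels you assigned beforehand, which is usually the first thing to check before interpreting the clusters.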
If you get the latest version of R (3.3) and the latest version of DESeq2 (1.12) you can use an even faster version of VST I just implemented:
vsd <- vst(dds)
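To go from clusters to a differential expression analysis, one option is to store the cluster labels from cutree in the DESeqDataSet and use them in the design. The sketch below uses the simulated dataset from makeExampleDESeqDataSet in place of your own dds, and `k = 2` and the column name `cluster` are assumptions for illustration. Keep in mind that the clusters were defined from the same expression data being tested, so the resulting p-values will be optimistic and are better read as a ranking of the genes that characterize each cluster than as formal tests.

```r
library(DESeq2)

# Example dataset standing in for your own dds (simulated counts)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Transform, cluster samples, and cut the tree (k = 2 is arbitrary)
vsd <- varianceStabilizingTransformation(dds)
hc <- hclust(dist(t(assay(vsd))))
dds$cluster <- factor(cutree(hc, k = 2))

# Re-fit with cluster as the variable of interest
design(dds) <- ~ cluster
dds <- DESeq(dds)

# Compare cluster 2 against cluster 1 and list the top genes by adjusted p-value
res <- results(dds, contrast = c("cluster", "2", "1"))
head(res[order(res$padj), ])
```

With more than two clusters, each pairwise comparison can be pulled out by changing the last two elements of `contrast`.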