Question

Are k-means methods for sample clustering and/or gene expression clustering?


3.4 years ago

ecg1g15 ▴ 30

I have an RNA-seq dataset for which I wanted to know the "ideal" number of clusters to use for the statistical analysis. After applying the elbow method, the silhouette method and NbClust, all three agree on 3 main clusters.

This was done on my dataset after transforming it as described in https://www.statsandr.com/blog/clustering-analysis-k-means-and-hierarchical-clustering-by-hand-and-in-r/:

df <- scale(assay(vsd))

I am happy with my 3 clusters of samples. However, I have also seen k-means clusters used to define gene expression profiles across samples, as in C: How to make k-means clustering plot for relative expression?

So, just to confirm: do these three "ideal" clusters refer to sample similarity, i.e. are there three main clusters of samples based on how similar their transcriptomes are (which agrees with the dendrogram built from Euclidean distances)?

Or are these three groups the three main expression profiles across all my samples: i) upregulation in group A, ii) downregulation in group B, and iii) other? This also coincides with my dataset's heat map.
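To make the two readings concrete, here is a minimal sketch in base R. It rests on one fact: `kmeans()` clusters the rows of the matrix it is given, so the orientation of the input decides whether samples or genes get grouped. The matrix `expr` is simulated and only stands in for `assay(vsd)` (assumed here to be genes in rows, samples in columns). The object names, the choice of 3 centers and the `nstart` value are illustrative assumptions, not my actual pipeline.

```r
# Simulated stand-in for assay(vsd): 200 genes (rows) x 12 samples (columns).
# All names and sizes here are hypothetical.
set.seed(1)
expr <- matrix(rnorm(200 * 12), nrow = 200, ncol = 12,
               dimnames = list(paste0("gene", 1:200),
                               paste0("sample", 1:12)))

# kmeans() assigns one cluster label per ROW of its input.

# Reading 1: clusters of samples.
# Transpose so each row is a sample; scale() then standardizes each gene.
sample_km <- kmeans(scale(t(expr)), centers = 3, nstart = 25)
length(sample_km$cluster)   # 12: one label per sample

# Reading 2: clusters of gene expression profiles.
# Z-score each gene across samples, keeping genes in rows.
gene_z  <- t(scale(t(expr)))
gene_km <- kmeans(gene_z, centers = 3, nstart = 25)
length(gene_km$cluster)     # 200: one label per gene
```

Checking `length(km$cluster)` or `names(km$cluster)` on a fitted object shows which of the two was actually clustered.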

NbClust Kmeans rnaseq cluster • 875 views
