4.5 years ago

anikng

I tried a few methods for determining the best k for k-means clustering. When I applied two of these methods to my RNA-Seq dataset, I got k=2 or k=3. Considering that the matrix has 40,000 genes and 90 samples, I was expecting a higher k. I checked with multiple k.max values, but the result remained the same. Could you please give some suggestions?

```
library(factoextra)
library(NbClust)
Normalized_counts_cpm<-read.csv(file = "...Normalized_counts_cpm.csv", header = TRUE, sep=",", row.names = 1)
```

With Elbow method,

```
fviz_nbclust(Normalized_counts_cpm, kmeans, method = "wss") +
  geom_vline(xintercept = 4, linetype = 2) +
  labs(subtitle = "Elbow method")
```
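For reference, here is a minimal sketch of roughly what the elbow plot summarizes: the total within-cluster sum of squares from `kmeans` for each candidate k. The `toy` matrix is hypothetical simulated data standing in for the real count matrix, and `nstart = 10` is an arbitrary choice. Note that `kmeans` clusters the rows of its input, so on a genes-by-samples matrix it is the genes that get clustered.

```r
# Hypothetical toy data (NOT the real counts): 300 rows in 3 well-separated groups
set.seed(1)
toy <- rbind(
  matrix(rnorm(100 * 10, mean = 0),  ncol = 10),
  matrix(rnorm(100 * 10, mean = 5),  ncol = 10),
  matrix(rnorm(100 * 10, mean = 10), ncol = 10)
)

k_values <- 1:10

# Total within-cluster sum of squares for each candidate k
wss <- sapply(k_values, function(k) {
  kmeans(toy, centers = k, nstart = 10)$tot.withinss
})

# The "elbow" is where adding another cluster stops reducing wss by much
print(data.frame(k = k_values, wss = round(wss, 1)))
```

Plotting `wss` against `k_values` (for example with `plot(k_values, wss, type = "b")`) gives the same kind of curve to inspect by eye.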

With Silhouette method,

```
fviz_nbclust(Normalized_counts_cpm, kmeans, method = "silhouette",k.max = 30)
```
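The silhouette criterion can be sketched by hand in a similar way, assuming the `cluster` package (which ships with standard R installations) is available. Again, `toy` is hypothetical simulated data and not the real matrix. The average silhouette width is computed for each k, and the k with the largest average is the one this method would suggest.

```r
library(cluster)  # provides silhouette()

# Hypothetical toy data (NOT the real counts): 300 rows in 3 well-separated groups
set.seed(1)
toy <- rbind(
  matrix(rnorm(100 * 10, mean = 0),  ncol = 10),
  matrix(rnorm(100 * 10, mean = 5),  ncol = 10),
  matrix(rnorm(100 * 10, mean = 10), ncol = 10)
)

d <- dist(toy)      # pairwise Euclidean distances between rows
k_values <- 2:10    # silhouette width is undefined for k = 1

# Average silhouette width for each candidate k
avg_sil <- sapply(k_values, function(k) {
  cl <- kmeans(toy, centers = k, nstart = 10)$cluster
  mean(silhouette(cl, d)[, "sil_width"])
})

best_k <- k_values[which.max(avg_sil)]
print(data.frame(k = k_values, avg_sil = round(avg_sil, 3)))
print(best_k)
```

One practical caveat: `dist()` on all rows of a 40,000-row matrix needs a very large distance matrix, so a sketch like this is only practical on a subset of rows.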

Do you a priori expect more?

With MeV k-means clustering analysis, k=10 gave me reasonable clusters.