Question: Hierarchical clustering evaluation
8 days ago by
teabonng wrote:


I clustered more than 10,000 genes based on their expression profiles under different treatment conditions, using the dynamicTreeCut package in R. It produced around 90 clusters. How do I evaluate the clustering results with such a high number of clusters?

modified 8 days ago by Jean-Karim Heriche • written 8 days ago by teabonng

How do you know it is a high number of clusters? And I assume you mean 10,000 sequences.

If you know which genes they are, you can check whether different genes end up in the same cluster. If they do, that is not what you want, and you need to change the parameters.

modified 8 days ago • written 8 days ago by gb180

The output of the dynamicTreeCut package in R is more than 90 clusters. I am clustering more than 10,000 genes based on their expression across various treatments.

I cannot check whether the genes in the same cluster are different, because that is exactly my objective: to learn which genes behave the same way.

written 7 days ago by teabonng
8 days ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche wrote:

The way to deal with large numbers of anything is to have a computer do the work. The question you should ask yourself is: what are you trying to assess? There are many ways to evaluate a clustering result, but they depend on the context and the goal of the clustering. If you have anything resembling a ground truth, use it. You can also use feature enrichment analysis (e.g. with any annotation you may have) to assess whether the clusters make sense in the context of the experiments. Otherwise, you can compute various internal indices, such as the silhouette.
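To make the silhouette suggestion concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the function names, the toy expression profiles, and the choice of 1 minus Pearson correlation as the distance are mine, not something from the question. For each gene, the silhouette width compares the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b), giving (b - a) / max(a, b). Values near 1 indicate a well-placed gene, values near 0 a gene between clusters, and negative values a gene that is probably misassigned.

```python
import math

def pearson_distance(x, y):
    """Distance between two expression profiles: 1 - Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # assumption: treat a flat profile as uncorrelated
    return 1.0 - cov / (sx * sy)

def silhouette_widths(profiles, labels, distance=pearson_distance):
    """Silhouette width per gene. Requires at least two distinct labels."""
    n = len(profiles)
    # Full pairwise distance matrix: fine for a toy example, O(n^2) in general.
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = distance(profiles[i], profiles[j])

    # Group gene indices by cluster label.
    members = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)

    widths = []
    for i in range(n):
        own = members[labels[i]]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other genes in the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the genes of any other cluster
        b = min(
            sum(dist[i][j] for j in other) / len(other)
            for lab, other in members.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        widths.append(0.0 if denom == 0 else (b - a) / denom)
    return widths

# Toy data (made up): three rising and three falling profiles across 4 treatments.
profiles = [
    [1, 2, 3, 4], [2, 4, 6, 8.5], [1, 2, 3, 5],
    [4, 3, 2, 1], [8, 6, 4, 2.5], [5, 3, 2, 1],
]
labels = [1, 1, 1, 2, 2, 2]

widths = silhouette_widths(profiles, labels)
for lab in sorted(set(labels)):
    cluster_widths = [w for w, l in zip(widths, labels) if l == lab]
    print(lab, sum(cluster_widths) / len(cluster_widths))
```

With around 90 clusters, the mean silhouette width per cluster gives a quick ranking of which clusters are tight and which are questionable. This pure-Python version would be far too slow for 10,000 genes; in practice a library routine such as `silhouette` in the R package `cluster` is the better choice.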

written 8 days ago by Jean-Karim Heriche
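The feature enrichment idea from the answer above can be sketched as a one-sided hypergeometric test: given how many genes overall carry some annotation, how surprising is the number found in one cluster? The function name and all the counts below are hypothetical, chosen only for illustration; real analyses would normally rely on an established enrichment tool and the actual annotation you have.

```python
from math import comb

def hypergeom_enrichment_p(cluster_size, annotated_in_cluster,
                           total_genes, annotated_total):
    """P(X >= annotated_in_cluster) when cluster_size genes are drawn
    without replacement from total_genes, of which annotated_total
    carry the annotation (one-sided hypergeometric test)."""
    max_k = min(cluster_size, annotated_total)
    denom = comb(total_genes, cluster_size)
    # Sum the upper tail of the hypergeometric distribution exactly,
    # using integer arithmetic until the final division.
    tail = sum(
        comb(annotated_total, k)
        * comb(total_genes - annotated_total, cluster_size - k)
        for k in range(annotated_in_cluster, max_k + 1)
    )
    return tail / denom

# Hypothetical numbers: 10,000 clustered genes, 200 of them annotated
# with some term, and a cluster of 100 genes containing 12 annotated ones
# (about 2 would be expected by chance).
p = hypergeom_enrichment_p(cluster_size=100, annotated_in_cluster=12,
                           total_genes=10000, annotated_total=200)
print(p)
```

Since this test would be repeated for every cluster and every annotation term, the resulting p-values need a multiple-testing correction (for example Benjamini-Hochberg) before any cluster is called enriched.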