Hierarchical Clustering-Dendrogram Cut-Off
10.7 years ago
Diana ▴ 900

I have microarray data and I want to identify groups of genes that have similar expression patterns across my samples. I've done average-linkage hierarchical clustering in R. How can I determine which cut-off would be best for the dendrogram so that the resulting clusters are significant? The height (combination similarity) of my dendrogram goes from 0 to about 2.5, with almost all clusters of genes between heights 0 and 0.5. How important is it to have a cut-off?

Many thanks.

clustering • 13k views
10.7 years ago
John ★ 1.5k

This is a slightly different approach, but it is useful to bootstrap the clustering to see how stable the clusters are. You can use the R package pvclust.

library(pvclust)

# pvclust clusters the columns of `data`; use t(data) if genes are in rows
result <- pvclust(data, nboot = 10000)

plot(result)
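
To turn the bootstrap p-values into actual gene groups, something along these lines might work. This is only a sketch: the toy matrix `expr` is made up for illustration, the 0.95 threshold and the small `nboot` are assumptions chosen to keep it quick, and `pvrect()` / `pvpick()` are the pvclust helpers for highlighting and extracting clusters whose AU p-value passes the threshold.

```r
library(pvclust)

set.seed(1)

# Hypothetical toy data: 30 genes (rows) x 12 samples (columns),
# built from two underlying expression patterns plus noise.
pattern1 <- rnorm(12)
pattern2 <- rnorm(12)
expr <- rbind(
  t(replicate(15, pattern1 + rnorm(12, sd = 0.2))),
  t(replicate(15, pattern2 + rnorm(12, sd = 0.2)))
)
rownames(expr) <- paste0("gene", 1:30)

# pvclust clusters columns, so transpose to cluster genes.
# nboot is kept small here for speed; use far more in practice.
fit <- pvclust(t(expr), method.hclust = "average",
               method.dist = "correlation", nboot = 100)

plot(fit)                  # dendrogram annotated with AU/BP values
pvrect(fit, alpha = 0.95)  # box clusters with AU p-value >= 0.95

# Gene names of the clusters that pass the threshold
sig <- pvpick(fit, alpha = 0.95)
sig$clusters
```

With this approach you do not pick a single height at all: each well-supported branch becomes a cluster, whatever height it sits at.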


pvclust is pretty sweet!

10.7 years ago
Vitis ★ 2.5k

This is a very hard decision to make. First, you can plot the within-group sum of squares against the number of clusters to find the optimum number, then cut the tree (dendrogram) accordingly. Second, there are more sophisticated dynamic tree-cutting approaches, for example: http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/ These methods are more involved, so you may need to read more about them.
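
A rough sketch of the first suggestion, using base R only. The toy matrix `expr`, the Euclidean distance, the helper `wss()` and the final choice of `k = 3` are all assumptions for illustration; with real data you would pick k by looking for the bend in the plot.

```r
set.seed(42)

# Hypothetical toy data: 60 genes x 10 samples with three planted patterns.
centers <- matrix(rnorm(30, sd = 2), nrow = 3)
expr <- centers[rep(1:3, each = 20), ] +
  matrix(rnorm(600, sd = 0.3), nrow = 60)
rownames(expr) <- paste0("gene", seq_len(60))

# Average-linkage clustering (Euclidean distance is an assumption here).
hc <- hclust(dist(expr), method = "average")

# Total within-group sum of squares for a given cluster assignment:
# squared deviations of each gene from its cluster's mean profile.
wss <- function(x, groups) {
  sum(sapply(split(as.data.frame(x), groups), function(g) {
    sum(scale(as.matrix(g), center = TRUE, scale = FALSE)^2)
  }))
}

# Cut the same tree into 1..10 clusters and record the WSS each time.
ks <- 1:10
wss_by_k <- sapply(ks, function(k) wss(expr, cutree(hc, k = k)))

plot(ks, wss_by_k, type = "b",
     xlab = "Number of clusters", ylab = "Within-group sum of squares")

# Cut at the k where the curve flattens (3 is assumed for this toy data).
clusters <- cutree(hc, k = 3)
table(clusters)
```

Note that `cutree()` accepts either `k` (number of clusters) or `h` (height), so once you have settled on a number of clusters you never need to work out the corresponding height by hand.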
