Hi, I am doing hierarchical clustering on TCGA data from several tissues. I run the clustering several times, each time on a different omics data type, and would like to choose the best of the resulting trees. I am looking for a hierarchical equivalent of the cluster homogeneity / separation / silhouette scores used to evaluate non-hierarchical clustering solutions. Is anyone familiar with a generalization of homogeneity / separation / silhouette score to tree structures?
The best approach would be to bootstrap the clustering step, as pvclust does. Bootstrapping yields a probability for each branch point in your dendrogram, so you can see which branches are well supported by the data. I also posted some short code showing how to do that (albeit for an unrelated topic) here: A: how to make bootstrapped tree in PVCLUST package with SNP genotyping data?
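To make the idea concrete, here is a minimal sketch in Python (pvclust itself is an R package, so this is a swapped-in illustration, not pvclust's code). It computes only the plain bootstrap probability of each branch, i.e. the fraction of feature-resampled trees in which the same group of samples reappears; it does not implement pvclust's multiscale "approximately unbiased" (AU) p-value. The function names `clades` and `bootstrap_support`, the linkage settings, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage


def clades(Z, n):
    """Leaf set (as a frozenset) of every internal node in linkage matrix Z."""
    members = {i: frozenset([i]) for i in range(n)}
    out = []
    for k, row in enumerate(Z):
        merged = members[int(row[0])] | members[int(row[1])]
        members[n + k] = merged  # scipy numbers the k-th merge as node n + k
        out.append(merged)
    return out


def bootstrap_support(X, n_boot=200, method="average", metric="euclidean", seed=0):
    """X is samples x features. Returns {clade: fraction of replicates containing it}."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    ref = clades(linkage(X, method=method, metric=metric), n)
    counts = dict.fromkeys(ref, 0)
    for _ in range(n_boot):
        cols = rng.integers(0, p, size=p)  # resample features with replacement
        found = set(clades(linkage(X[:, cols], method=method, metric=metric), n))
        for c in ref:
            if c in found:
                counts[c] += 1
    return {c: k / n_boot for c, k in counts.items()}


# Synthetic stand-in for an omics matrix: two groups of 5 samples, 20 features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (5, 20)), rng.normal(10, 1, (5, 20))])
support = bootstrap_support(X, n_boot=100)
for clade, bp in support.items():
    print(sorted(clade), round(bp, 2))
```

Running this once per omics data type gives a per-branch support profile for each tree, which is one way to compare them; for real analyses the AU values reported by pvclust are the more usual choice.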
For other types of comparisons, take a look at dendextend: Comparing two dendrograms. In particular, look at the very simplistic entanglement metric.
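As a rough illustration of comparing two dendrograms numerically, the sketch below computes the correlation between the cophenetic distances of two trees built on the same samples. This is a different measure from entanglement (which depends on how the leaves are ordered when the trees are drawn side by side); dendextend offers the same kind of comparison in R as `cor_cophenetic`. The function name `cophenetic_correlation`, the linkage settings, and the synthetic matrices are assumptions for the example, and the rows of both matrices must be the same samples in the same order.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage


def cophenetic_correlation(X1, X2, method="average", metric="euclidean"):
    """Pearson correlation between the cophenetic distances of two trees.

    X1 and X2 are samples x features matrices with identical row order.
    """
    d1 = cophenet(linkage(X1, method=method, metric=metric))  # condensed vector
    d2 = cophenet(linkage(X2, method=method, metric=metric))
    return float(np.corrcoef(d1, d2)[0, 1])


# Two synthetic "omics" layers sharing the same two-group sample structure.
rng = np.random.default_rng(2)
base = np.repeat([0.0, 10.0], 5)[:, None]
omics_a = base + rng.normal(0, 1, (10, 30))
omics_b = base + rng.normal(0, 1, (10, 12))
r = cophenetic_correlation(omics_a, omics_b)
print(round(r, 3))
```

A value near 1 means the two trees place the samples at similar relative heights; values near 0 mean the trees have little in common.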
You have not indicated that you're specifically looking for ways to determine the ideal cluster solution for a dataset, for which there are many other methods, some of which you have already mentioned.