Question: hierarchical cluster of drug treatment after correlation analysis
9 months ago, limin201709 wrote:


I computed correlations between one drug treatment and several other drug treatments, which gave me a data frame in which the column names are drug names, the row names are gene names, and the values are correlation coefficients. I want to perform hierarchical clustering on this data frame to see whether any genes show the same expression pattern across treatments, or whether some genes are specific to one drug treatment. Which methods should I use: Euclidean or some other distance, and average or complete linkage?

Thanks a lot, Min

correlation R cluster • 266 views
9 months ago, Kevin Blighe (Republic of Ireland) wrote:

Check the distribution of your data first by generating a histogram. If it looks like the typical 'bell' curve (normal distribution), then use Euclidean distance. If not, consider a correlation dissimilarity, i.e., 1 minus the Spearman correlation. You could also transform your data to the Z-scale, in which case it may more closely resemble a normal curve, and you could then use Euclidean distance. If your data is some other type of non-negative and/or ordinal data, then you may consider Manhattan or Canberra distance.
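The options above could be sketched in R roughly as follows. This is only an illustration: `cor_mat` is a hypothetical, simulated stand-in for the genes-by-drugs table of correlation coefficients, and clustering the drugs (columns) and Z-scaling per gene are assumptions, not something stated in the question.

```r
# Hypothetical data: rows = genes, columns = drug treatments,
# values = correlation coefficients (simulated, for illustration only).
set.seed(1)
cor_mat <- matrix(runif(200 * 6, min = -1, max = 1), nrow = 200,
                  dimnames = list(paste0("gene", 1:200), paste0("drug", 1:6)))

# 1. Inspect the distribution of the values.
hist(as.vector(cor_mat), breaks = 30,
     main = "Correlation coefficients", xlab = "r")

# 2a. Euclidean distance between drugs. dist() works on rows,
#     so the matrix is transposed to compare columns (drugs).
d_euc <- dist(t(cor_mat), method = "euclidean")

# 2b. Correlation dissimilarity: 1 minus Spearman correlation between drugs.
d_spear <- as.dist(1 - cor(cor_mat, method = "spearman"))

# 2c. Z-scale each gene (row), then use Euclidean distance.
#     scale() standardises columns, hence the double transpose.
z_mat <- t(scale(t(cor_mat)))
d_z <- dist(t(z_mat), method = "euclidean")

# 2d. Manhattan and Canberra distances. Canberra is intended for
#     non-negative values; abs() is used here only to obtain a
#     non-negative example matrix.
d_man <- dist(t(cor_mat), method = "manhattan")
d_can <- dist(t(abs(cor_mat)), method = "canberra")
```

To cluster genes rather than drugs, drop the outer `t()` in the `dist()` calls and compute the Spearman correlation on `t(cor_mat)` instead.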

In terms of the linkage method, you have more liberty to choose one over another than you do for the distance metric. Ward's linkage (ward.D2) usually makes a tree easier to interpret visually than other methods, because it merges clusters so as to minimise the increase in within-cluster variance ('minimum variance').
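As a rough sketch of how the linkage choice plays out in R, the same distance matrix can be passed to `hclust()` with different `method` values and the trees compared side by side. The data here are simulated and `cor_mat` is a hypothetical name; the choice of `k = 2` groups is arbitrary and only for illustration.

```r
# Hypothetical data: rows = genes, columns = drug treatments (simulated).
set.seed(1)
cor_mat <- matrix(runif(200 * 6, min = -1, max = 1), nrow = 200,
                  dimnames = list(paste0("gene", 1:200), paste0("drug", 1:6)))

# Euclidean distance between drugs (columns).
d <- dist(t(cor_mat), method = "euclidean")

# Same distances, three different linkage methods.
hc_ward <- hclust(d, method = "ward.D2")
hc_avg  <- hclust(d, method = "average")
hc_comp <- hclust(d, method = "complete")

# Plot the three dendrograms next to each other for visual comparison.
old_par <- par(mfrow = c(1, 3))
plot(hc_ward, main = "Ward (ward.D2)", xlab = "", sub = "")
plot(hc_avg,  main = "Average",        xlab = "", sub = "")
plot(hc_comp, main = "Complete",       xlab = "", sub = "")
par(old_par)

# Cut the Ward tree into an arbitrary number of groups (k = 2 here).
groups <- cutree(hc_ward, k = 2)
```

`groups` is a named integer vector giving a cluster label for each drug, which can then be used to pull out the genes that distinguish the groups.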

