Question: Hierarchical clustering of drug treatments after correlation analysis
22 months ago, limin201709 wrote:


I am computing correlations between one drug treatment and several other drug treatments. The result is a data frame in which the column names are drug names, the row names are gene names, and the values are correlation coefficients. I want to perform hierarchical clustering on this data frame to see whether there are common genes with the same expression pattern, or genes that are specific to one drug treatment. Which methods should I use: Euclidean distance or something else, and average or complete linkage?

Thanks a lot, Min

correlation R cluster • 509 views
22 months ago, Kevin Blighe wrote:

Check the distribution of your data first by generating a histogram. If it looks like the typical 'bell' curve (normal distribution), then use Euclidean distance. If not, then you may consider correlation dissimilarities via 1 minus Spearman correlation. You could also transform your data to the Z-scale, in which case it would most likely be closer to a normal curve, and you could then use Euclidean distance. If your data is some other unusual type of non-negative and/or ordinal data, then you may consider Manhattan or Canberra distance.
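To make the options above concrete, here is a rough R sketch using only base functions (`hist()`, `dist()`, `cor()`, `scale()`). The matrix `cor_mat` is simulated and purely hypothetical; it stands in for your genes-by-drugs table of correlation coefficients. Clustering genes (rows) and Z-scaling each gene across drugs are assumptions on my part, so adjust if you want to cluster drugs instead.

```r
set.seed(1)

# Hypothetical stand-in for the real data: 50 genes (rows) x 6 drugs (columns),
# filled with simulated correlation coefficients in [-1, 1].
cor_mat <- matrix(runif(50 * 6, min = -1, max = 1), nrow = 50,
                  dimnames = list(paste0("gene", 1:50), paste0("drug", 1:6)))

# Step 1: look at the distribution of the values.
hist(as.vector(cor_mat), breaks = 30, main = "Correlation coefficients")

# Option A: roughly bell-shaped data -> Euclidean distance between genes.
d_euc <- dist(cor_mat, method = "euclidean")

# Option B: not bell-shaped -> 1 minus Spearman correlation between genes.
# cor() correlates columns, so transpose to compare genes with each other.
d_spear <- as.dist(1 - cor(t(cor_mat), method = "spearman"))

# Option C: Z-scale each gene across drugs (an assumption), then Euclidean.
# scale() works on columns, hence the double transpose.
z_mat <- t(scale(t(cor_mat)))
d_z <- dist(z_mat, method = "euclidean")

# Option D: Manhattan or Canberra. Canberra is mainly intended for
# non-negative data, so it may not suit signed correlation coefficients.
d_man <- dist(cor_mat, method = "manhattan")
d_can <- dist(cor_mat, method = "canberra")
```

Any of these `dist` objects can be passed directly to `hclust()`.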

In terms of the linkage method, you have more liberty to choose one over another than you do for the distance metric. Ward's linkage (ward.D2 in R's hclust()) usually makes a tree easier to interpret visually than other methods, due to the way that it merges branches based on 'minimal variance'.
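As a minimal sketch of how the linkage choice plays out in R, the snippet below runs `hclust()` with Ward's, average, and complete linkage on the same Euclidean distance matrix. Again, `cor_mat` is simulated, hypothetical data, and cutting the tree into `k = 4` groups is an arbitrary choice for illustration only.

```r
set.seed(1)

# Hypothetical genes-by-drugs matrix (simulated values, not real data).
cor_mat <- matrix(rnorm(50 * 6), nrow = 50,
                  dimnames = list(paste0("gene", 1:50), paste0("drug", 1:6)))

# Euclidean distance between genes (rows).
d <- dist(cor_mat, method = "euclidean")

# Same distances, three linkage methods.
hc_ward     <- hclust(d, method = "ward.D2")
hc_average  <- hclust(d, method = "average")
hc_complete <- hclust(d, method = "complete")

# Draw the Ward dendrogram; repeat for the others to compare them visually.
plot(hc_ward, cex = 0.6, main = "Ward's linkage (ward.D2)")

# Cut the tree into k groups (k = 4 is arbitrary here) and count genes per group.
clusters <- cutree(hc_ward, k = 4)
table(clusters)
```

Note that Ward's criterion is defined in terms of Euclidean distances, so pairing `ward.D2` with a correlation-based dissimilarity is a judgment call rather than a strict rule.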

