Hierarchical clustering of drug treatments after correlation analysis
6.1 years ago
limin201709 ▴ 10

Hi,

I am correlating one drug treatment with several other drug treatments, so I have a data frame in which the column names are drug names, the row names are gene names, and the values are correlation coefficients. I want to perform hierarchical clustering on this data frame to see whether any genes share the same expression pattern across treatments, or whether some genes are specific to one drug treatment. Which methods should I use: Euclidean distance or another distance, and average or complete linkage?

Thanks a lot, Min

R cluster correlation • 1.1k views
6.1 years ago

Check the distribution of your data first by generating a histogram. If it looks like the typical 'bell' curve (normal distribution), then use Euclidean distance. If not, then you may consider correlation dissimilarities via 1 minus Spearman correlation. You could also transform your data to the Z-scale (center each variable and divide by its standard deviation); if the scaled values then approximate a normal curve, you could use Euclidean distance. If your data is some other unusual type of non-negative and/or ordinal data, then you may consider Manhattan or Canberra distance.
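The options above can be sketched in base R as follows. This is only an illustration under stated assumptions: `cor_mat` is a simulated, hypothetical stand-in for the genes-by-drugs data frame of correlation coefficients, and the names `d_euc`, `d_spear`, `z_mat`, `d_z` and `d_man` are made up for the example.

```r
# Hypothetical example data: 50 genes (rows) x 6 drugs (columns) of
# correlation coefficients. Replace this with your own data frame.
set.seed(1)
cor_mat <- matrix(runif(50 * 6, min = -1, max = 1), nrow = 50,
                  dimnames = list(paste0("gene", 1:50), paste0("drug", 1:6)))

# 1. Look at the distribution of the values first.
hist(cor_mat, breaks = 30, main = "Correlation coefficients", xlab = "r")

# 2a. Euclidean distance between genes (dist() compares rows).
d_euc <- dist(cor_mat, method = "euclidean")

# 2b. Correlation dissimilarity: 1 minus Spearman correlation.
#     cor() compares columns, so transpose to compare genes.
d_spear <- as.dist(1 - cor(t(cor_mat), method = "spearman"))

# 2c. Z-scale each gene (row), then use Euclidean distance.
#     scale() works on columns, hence the double transpose.
z_mat <- t(scale(t(cor_mat)))
d_z <- dist(z_mat, method = "euclidean")

# 2d. Manhattan distance; method = "canberra" is the other option mentioned,
#     usually intended for non-negative data.
d_man <- dist(cor_mat, method = "manhattan")
```

To cluster the drugs instead of the genes, the same calls can be applied to `t(cor_mat)`.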

In terms of the linkage method to use, you have more liberty to choose one over another, i.e., more liberty than you do for the distance metric. Ward's linkage (ward.D2) usually makes a tree easier to interpret visually than other methods due to the way that it merges branches based on 'minimum variance'.
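As a rough sketch of how the linkage choice plays out in R, the snippet below runs `hclust()` with Ward's, average and complete linkage on the same distance matrix. The data are simulated and hypothetical (`cor_mat` stands in for the genes-by-drugs matrix), and cutting the tree into `k = 4` groups is an arbitrary choice made only for illustration.

```r
# Hypothetical example data: 50 genes (rows) x 6 drugs (columns).
set.seed(1)
cor_mat <- matrix(runif(50 * 6, min = -1, max = 1), nrow = 50,
                  dimnames = list(paste0("gene", 1:50), paste0("drug", 1:6)))

# Euclidean distance between genes.
d <- dist(cor_mat, method = "euclidean")

# Same distances, three linkage methods, so the trees can be compared.
hc_ward <- hclust(d, method = "ward.D2")
hc_avg  <- hclust(d, method = "average")
hc_comp <- hclust(d, method = "complete")

# Cluster the drugs (columns) as well by transposing the matrix.
hc_drugs <- hclust(dist(t(cor_mat)), method = "ward.D2")

# Plot the trees.
plot(hc_ward, labels = FALSE, main = "Genes, Ward's linkage (ward.D2)")
plot(hc_drugs, main = "Drugs, Ward's linkage (ward.D2)")

# Cut the gene tree into k groups (k = 4 is arbitrary here).
groups <- cutree(hc_ward, k = 4)
table(groups)
```

Plotting `hc_avg` and `hc_comp` next to `hc_ward` is a quick way to judge which tree is easiest to read for a given data set.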

Kevin
