Question: Hierarchical cluster analysis
13 months ago, drshahzadbhatti wrote:


I would appreciate your comments on my cluster analysis, as I have very little knowledge of the method. What would be the best interpretation of it?

[Image: dendrogram]

Tags: rna-seq, sequence, R, gene

To get help, please explain what your data are, what question you're trying to address, and what analysis steps you've taken. This is in addition to providing whichever image we're supposed to look at.

Reply written 13 months ago by Jean-Karim Heriche

Could you kindly send me your email address? I will send the dendrogram to it.



Reply written 13 months ago by drshahzadbhatti

The dendrogram alone is not sufficient; we need the context, i.e. the answers to the questions I asked above. If the image doesn't show up here, upload it somewhere and give the link here. Others may be better able to answer your question than I am.

Reply written 13 months ago by Jean-Karim Heriche

Shahzad, you can share images/figures by uploading them here: Then, obtain the HTML URL and paste it into a reply here.

Also, as my colleague says, you should provide more information on the programs you're using, what the experiment is about (RNA-seq? identity-by-state? microarray?), etc.

Reply written 13 months ago by Kevin Blighe

Dear Kevin, thanks for your help. I have pasted the link below. As far as my experiment is concerned, I want to find out the relationship of CAG and GCC repeat lengths with testosterone, oxytocin and the other parameters listed below in diabetic patients with premature ejaculatory dysfunction. The parameters that I have studied are:

1) Body mass index (BMI) and hormonal assays, including oxytocin, prolactin, testosterone, and TSH.

2) Sexual and mental performance: sexual performance was estimated with the International Index of Erectile Function-15 (IIEF-15), which assigns a score to each domain: sexual desire (SD), 2-10; intercourse satisfaction (IS), 0-15; orgasm (OR), 0-10; and overall satisfaction (OS), 2-10. Erectile function (EF) is classed as normal (26-30), slightly impaired (17-25), moderately impaired (11-16), or severely impaired (<11) (29).

3) Degree of depression, calculated with Beck's Depression Inventory (BDI).

4) Premature ejaculation, assessed with the Premature Ejaculation Diagnostic Tool (PEDT).

5) IELT: intravaginal ejaculatory latency time, calculated for premature ejaculation.

I have drawn a hierarchical cluster dendrogram to cluster these parameters and need your help to interpret it.

Reply written 13 months ago by drshahzadbhatti • modified 13 months ago by Kevin Blighe

Hi Shahzad,

You appear to have posted the same comment 4 times, so, I have removed 3 of them (and also tidied up your comment, above). Before leaving the page after making a comment, you should check how your comment appears. If it is not formatted correctly, you can edit it.

Your study sounds very interesting. Thank you for providing the dendrogram. It is very small and I cannot read the variable names. However, I can provide the following interpretation:

1) The person who developed the dendrogram appears to have decided that there are 3 main groups in the data, as indicated by the positioning of the dotted horizontal line. These 3 groups consist of:

  • The 4 variables on the left of the dendrogram
  • The red-shaded variables
  • The blue-shaded variables

2) The 2 variables on the left of the dendrogram are almost identical to each other.

3) The red- and blue-shaded groups of variables are more similar to each other than they are to the 4 variables on the left.

A key point about dendrograms, Shahzad, is the height of the vertical bars and where the bars of 2 variables (or variable groups) meet. For example, I can immediately see that the 2 variables on the left are almost identical to each other because their vertical bars meet/merge at a very low height. On the other hand, the vertical bar for all 4 variables on the left merges with the other 2 main groups in the dendrogram at the maximum possible height at the top of the dendrogram, indicating that these 4 variables are very 'dissimilar', i.e., different, from these other 2 groups.

A dendrogram is usually a graphical representation of a hierarchical clustering computed from what is called a 'distance matrix', i.e., the pairwise distances (often Euclidean) from one sample to another.
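To make the point about merge heights concrete, here is a minimal sketch in plain Python. The five 2-D points and their names are made up for illustration, and single linkage is only one possible choice; I do not know which linkage or distance was used for your figure. It computes Euclidean distances, records the height at which each pair of groups merges, and then "cuts" the tree with a horizontal line, as the dotted line does in your dendrogram.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points):
    """Agglomerative clustering; returns merges as (members_a, members_b, height)."""
    clusters = [frozenset([name]) for name in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Single linkage: distance between groups = smallest pairwise distance.
        for ca, cb in combinations(clusters, 2):
            d = min(euclidean(points[i], points[j]) for i in ca for j in cb)
            if best is None or d < best[0]:
                best = (d, ca, cb)
        d, ca, cb = best
        merges.append((sorted(ca), sorted(cb), d))
        clusters = [c for c in clusters if c not in (ca, cb)] + [ca | cb]
    return merges

def cut(points, merges, height):
    """Groups left after drawing a horizontal line at `height`."""
    groups = [frozenset([name]) for name in points]
    for a, b, h in merges:
        if h < height:  # only merges below the line are kept
            fa, fb = frozenset(a), frozenset(b)
            groups = [g for g in groups if g not in (fa, fb)] + [fa | fb]
    return sorted(sorted(g) for g in groups)

# Hypothetical toy data: A and B nearly identical, E far from everything.
points = {
    "A": (1.0, 1.0), "B": (1.1, 1.0),
    "C": (5.0, 5.0), "D": (5.0, 6.0),
    "E": (20.0, 20.0),
}

merges = single_linkage(points)
for a, b, h in merges:
    print(f"{a} + {b} merge at height {h:.2f}")

groups = cut(points, merges, height=3.0)
print(groups)
```

A and B merge first, at a very low height, because they are almost identical; E joins last, at the greatest height, because it is far from everything else. Moving the cut height up or down changes how many groups you get. In practice you would use a library for this rather than hand-rolled code, for example `scipy.cluster.hierarchy.linkage` and `dendrogram` in Python or `hclust` in R.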

Reply written 13 months ago by Kevin Blighe

You should also tell us how you processed the data for clustering: did you preprocess (e.g. standardize) the variables? What distance measure did you use? I can see a potential issue if you went ahead and computed the Euclidean distance on the "raw" values, because you have a mix of different variable types: category 1 are interval-scale (i.e. numeric) data and category 2 are ordinal data (I don't know about 3-5).
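As a rough illustration of why the scale matters, here is a small sketch in plain Python. The three patients and their testosterone and BMI values are invented for the example, and z-scoring is only one possible preprocessing choice; it does not by itself address the ordinal variables.

```python
import math
import statistics

def zscore(values):
    """Centre on the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical values for three patients (not real data).
raw = {
    "testosterone": [300.0, 500.0, 700.0],  # large numeric range
    "bmi": [22.0, 30.0, 24.0],              # small numeric range
}

def patient_rows(table):
    """Turn {variable: [values per patient]} into one tuple per patient."""
    return list(zip(*table.values()))

raw_rows = patient_rows(raw)
std_rows = patient_rows({name: zscore(vals) for name, vals in raw.items()})

# Distances from patient 1 to patients 2 and 3, before and after scaling.
raw_d12 = euclidean(raw_rows[0], raw_rows[1])
raw_d13 = euclidean(raw_rows[0], raw_rows[2])
std_d12 = euclidean(std_rows[0], std_rows[1])
std_d13 = euclidean(std_rows[0], std_rows[2])

print(f"raw:          d(1,2)={raw_d12:.2f}  d(1,3)={raw_d13:.2f}")
print(f"standardized: d(1,2)={std_d12:.2f}  d(1,3)={std_d13:.2f}")
```

On the raw values, testosterone dominates the distance, so patient 1 looks closer to patient 2. After standardizing, both variables contribute on a comparable scale and patient 1 is closer to patient 3. The dendrogram can therefore change depending on this step, which is why it needs to be reported.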

Reply written 13 months ago by Jean-Karim Heriche

Dear all, I have created a clearer image; kindly take a look at it.

Reply written 13 months ago by drshahzadbhatti