I have a dataset of protein sequences in FASTA format, and my aim is to find the non-redundant sequences in it. I have computed the pairwise sequence similarity percentages and stored the results in an Excel sheet. My professor told me to use R to do hierarchical clustering (single-linkage method). I don't want to use any other software for this. I also have to create a dendrogram. How can I do hierarchical clustering of protein sequences in R? Could you give me an R script for this?
I would like an R script for:
1) Reading the Excel file 2) Hierarchical clustering (single linkage) 3) Phylogenetic analysis 4) Creating the dendrogram.
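To make the request concrete, here is a rough sketch of the workflow I have in mind. It is only a guess under several assumptions: the file name `pairwise_similarity.xlsx` is hypothetical, the sheet is assumed to hold a square matrix of percent similarities with sequence IDs in the first column and in the header row, the `readxl` package is assumed for reading Excel, and the 90% cut-off is an arbitrary example. If the file is missing, it falls back to made-up toy values so the script still runs.

```r
# Hypothetical file name and layout: a square table of pairwise percent
# similarities, sequence IDs in the first column and as column headers.
xlsx_path <- "pairwise_similarity.xlsx"

if (file.exists(xlsx_path) && requireNamespace("readxl", quietly = TRUE)) {
  # 1) Read the Excel sheet and turn it into a named numeric matrix.
  tab <- as.data.frame(readxl::read_excel(xlsx_path))
  sim <- as.matrix(tab[, -1])
  rownames(sim) <- tab[[1]]
} else {
  # Made-up toy values (not real data) so the sketch runs without the file.
  ids <- c("seqA", "seqB", "seqC", "seqD")
  sim <- matrix(c(100,  95,  40,  35,
                   95, 100,  42,  38,
                   40,  42, 100,  90,
                   35,  38,  90, 100),
                nrow = 4, byrow = TRUE, dimnames = list(ids, ids))
}

# Convert percent similarity to a distance (0 means identical).
d <- as.dist(100 - sim)

# 2) Single-linkage hierarchical clustering with base R (stats package).
hc <- hclust(d, method = "single")

# 4) Draw the dendrogram.
plot(hc, main = "Single-linkage clustering", xlab = "", sub = "",
     ylab = "Distance (100 - % similarity)")

# Cut the tree at distance 10, i.e. group sequences that are chained together
# at >= 90% similarity (arbitrary example threshold), then keep the first
# member of each cluster as its non-redundant representative.
clusters <- cutree(hc, h = 10)
representatives <- names(clusters)[!duplicated(clusters)]
print(representatives)
```

I am not sure this is the right way to pick representatives, and it does not cover the phylogenetic analysis step. I believe the `ape` package offers `nj()` for neighbour joining on the same distance object, but I have not tried it.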
Please help me.