pvclust in R
7.9 years ago

Hi, I have a binary matrix from which I would like to create a dendrogram with bootstrap values. Each row of my matrix is a dependent variable and each column is an independent variable, of which I have 49. The first column holds the names of the dependent variables, so I've excluded it from the clustering so that NAs aren't introduced.

I've used pvclust and haven't had any error messages, but I wanted to check that my code is doing what I think it's doing!

pv<-pvclust(data2[2:50], method.hclust="average",
method.dist="binary", use.cor="all.obs",
nboot=1000, parallel=FALSE, r=seq(.5,1.4,by=.1),
store=FALSE, weight=FALSE, iseed=NULL, quiet=FALSE)

plot(pv, hang=-1)

Thanks!

R pvclust clustering • 3.7k views

pvclust clusters the columns of the matrix, so make sure this is what you want; otherwise transpose the data with t().
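To make the orientation point concrete, here is a small base-R sketch. The data frame `x` and its column names are made up for illustration and are not the asker's data. It contrasts clustering rows (what `hclust(dist(x))` does) with clustering columns (what you get from `dist(t(x))`, which is the orientation pvclust is described as using).

```r
# Hypothetical toy binary data: 6 rows (observations), 4 columns (items).
x <- data.frame(A = c(1, 1, 0, 0, 1, 0),
                B = c(1, 0, 0, 1, 1, 0),
                C = c(0, 0, 1, 1, 0, 1),
                D = c(1, 1, 1, 0, 0, 0))

# dist() measures distances between ROWS, so this tree has 6 leaves.
hc_rows <- hclust(dist(x, method = "binary"), method = "average")

# Transposing first gives distances between COLUMNS, so this tree has 4 leaves.
hc_cols <- hclust(dist(t(x), method = "binary"), method = "average")

length(hc_rows$order)  # number of leaves when clustering rows
hc_cols$labels         # leaves are the column names when clustering columns
```

If the leaves of the pvclust dendrogram are not the items you intended to cluster, pass `t(x)` instead of `x`.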


Ah, OK. I think I remember reading that it is the opposite of hclust. I want to cluster the columns based on the rows, so I've definitely got the data frame in the right orientation. I was wondering more about the distance method: whether it is OK to use binary, and whether pvclust is suitable for a binary dataset.


You can use pvclust for binary data. The "binary" distance it uses is the Jaccard distance. Note also that you don't need use.cor, since you're not using the correlation distance; that argument can simply be dropped from your call.
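As a quick sanity check of the Jaccard point, the sketch below compares base R's `dist(..., method = "binary")` with a hand-written Jaccard distance. The matrix `m` and the helper `jaccard_dist` are hypothetical names introduced only for this illustration.

```r
# Hypothetical toy binary matrix; columns are the items being compared.
m <- cbind(A = c(1, 1, 0, 0, 1, 0),
           B = c(1, 0, 0, 1, 1, 0),
           C = c(0, 0, 1, 1, 0, 1))

# Jaccard distance for two 0/1 vectors, written out by hand.
jaccard_dist <- function(x, y) {
  either <- sum(x | y)  # positions where at least one vector is 1
  both   <- sum(x & y)  # positions where both vectors are 1
  1 - both / either     # undefined (NaN) if neither vector has any 1s
}

# dist() compares rows, so transpose to compare columns.
d <- as.matrix(dist(t(m), method = "binary"))

manual <- jaccard_dist(m[, "A"], m[, "B"])
all.equal(d["A", "B"], manual)  # the two computations should agree

# With use.cor dropped, the call from the question would reduce to
# something like the following (needs the pvclust package and the
# asker's own data2, so it is left commented out here):
# pv <- pvclust(data2[2:50], method.hclust = "average",
#               method.dist = "binary", nboot = 1000)
```

The other arguments in the original call (parallel, r, store, weight, iseed, quiet) appear to match pvclust's documented defaults, so leaving them out should not change the result, though it is worth confirming against the help page for your installed version.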


Perfect, thank you. Before writing it up I wanted to check I was using it right! Usually strange error messages are my main problem, so I doubted myself when there weren't any! Thanks again, Frances
