Question: Hierarchical Clustering In R
0
Sanju0 wrote:

Hi all,

I need to perform hierarchical clustering in R. I used the following code to import data from an Excel file and fill the diagonal and lower triangle of a matrix.

```r
library(gdata)  # provides read.xls(), which needs Perl

n_seq <- 15
mat <- matrix(NA, ncol = n_seq, nrow = n_seq)
diag(mat) <- 0  # zero on the diagonal

# Fill the lower triangle, one sheet per column
for (idx in 1:(n_seq - 1)) {
  intemp <- read.xls("C://clustal.xls", sheet = idx)
  mat[(1 + idx):n_seq, idx] <- intemp[1:(n_seq - idx), 11]
}
```

After this, when I call the hclust function, I get the following error.

```
hclust(mat)
Error in if (n < 2) stop("must have n >= 2 objects to cluster") :
argument is of length zero
```

modified 5.6 years ago by Mo920 • written 9.1 years ago by Sanju0
2

As it stands this is an R usage question, more suited to StackOverflow than here. Please indicate the relevance to bioinformatics.

2

Your matrix object might be NULL. Also make sure you have Perl installed: `read.xls()` (from the gdata package) runs Perl behind the scenes.

1

This is off topic imho. It's a pure R question.

Also, check the contents of your matrix, mat. It may not contain what you think it should contain.
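One way to run that check, sketched on a toy 4 x 4 matrix with made-up values (the name `mat` follows the question; everything else is an assumption for illustration). It also shows why the error mentions a length-zero argument: `hclust()` reads the `Size` attribute of a `dist` object, and a plain matrix has none. Converting with `as.dist()` is one possible way around that.

```r
# Toy lower-triangular matrix built the same way as `mat` (values are made up).
n <- 4
mat <- matrix(NA, nrow = n, ncol = n)
diag(mat) <- 0
mat[lower.tri(mat)] <- c(0.2, 0.5, 0.9, 0.4, 0.8, 0.3)
rownames(mat) <- colnames(mat) <- paste0("seq", 1:n)

print(mat)                    # look at what was actually imported
inherits(mat, "dist")         # FALSE: a plain matrix, not a "dist" object
is.null(attr(mat, "Size"))    # TRUE: no Size attribute for hclust() to read

d <- as.dist(mat)             # uses the lower triangle only
stopifnot(!anyNA(d))          # hclust() cannot handle missing distances
hc <- hclust(d)
plot(hc)
```

If the `anyNA()` check fails on the real data, some cells of the lower triangle were never filled during the import.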

8
David Quigley11k wrote:

Use dist() to calculate the distance matrix. Then feed that into hclust(). Also, you seem to be working a lot harder than you have to.

1) Save your Excel sheet as a tab-delimited text file called foo.txt with the format:

```
PROBE    P1    P2    P3
sample1    1    2    3
sample2    3    4    6
sample3    6    2    6
sample4    3    4    3
```

2) Call

```r
x <- read.table("foo.txt", sep = "\t", header = TRUE, row.names = 1)
plot(hclust(dist(x)))
```
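To try this without creating a file first, here is a minimal sketch that reads the same example table from an inline string (whitespace-separated rather than tab-separated, which is an assumption made only so the text can be pasted directly). It also prints the intermediate distance object, so you can see what `hclust()` actually receives.

```r
# The example table from above, embedded as text instead of foo.txt.
txt <- "PROBE P1 P2 P3
sample1 1 2 3
sample2 3 4 6
sample3 6 2 6
sample4 3 4 3"

x <- read.table(text = txt, header = TRUE, row.names = 1)

d <- dist(x)    # Euclidean distance between rows (samples) by default
print(d)        # lower-triangular display of the pairwise distances

hc <- hclust(d)
plot(hc)        # dendrogram with sample names as leaf labels
```

Note that `dist()` compares rows, so the items you want to cluster must be in the rows of `x`; use `t(x)` if they are in the columns.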

@David I have multiple sheets in my Excel file, and I don't think a *.txt file can hold more than one sheet. Is there another way?

1
Mo920 wrote:

If you have multiple sheets, you can do the following:

```r
library(XLConnect)
# Pass the path to your own workbook; system.file() only finds files shipped inside a package
yourM <- loadWorkbook("your data.xlsx")
Yourdata <- readWorksheet(yourM, sheet = getSheets(yourM))
```
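If installing XLConnect is not an option, one alternative (an assumption, not something proposed in this thread) is to export each sheet to its own tab-delimited file and read them in a loop with base R. The sketch below writes three toy files first so that it runs on its own; the file names `sheet_1.txt` etc. and the column names are hypothetical.

```r
# Toy stand-in for three exported sheets: write each one to its own
# tab-delimited file in a temporary directory (file names are hypothetical).
tmp <- tempdir()
for (i in 1:3) {
  sheet <- data.frame(id = paste0("seq", 1:2), score = c(i, i * 10))
  write.table(sheet, file.path(tmp, paste0("sheet_", i, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

# Read every exported sheet back into a named list of data frames.
files <- file.path(tmp, paste0("sheet_", 1:3, ".txt"))
sheets <- lapply(files, read.table, sep = "\t", header = TRUE)
names(sheets) <- paste0("sheet_", 1:3)

str(sheets[["sheet_2"]])  # inspect one sheet
```

Each list element can then be indexed by name or position, in the same way the per-sheet data frames are used in the question.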

First of all, check whether your data were imported correctly. Do they look the way you expect? If so, you can continue as below.

The following is the basic analysis; you can always experiment with different linkage methods and so on.

```r
# your_data: a numeric data frame or matrix, one row per item to cluster
d <- dist(as.matrix(your_data))
hc <- hclust(d)
plot(hc)
```
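As a rough illustration of trying different linkage methods, the sketch below reuses the small example table from the earlier answer, typed in as a matrix, and runs `hclust()` with three of its built-in `method` values. The choice of `k = 2` in `cutree()` is arbitrary and only there to show how to extract group labels.

```r
# Small example data: 4 samples x 3 probes (values from the example table).
x <- matrix(c(1, 2, 3,
              3, 4, 6,
              6, 2, 6,
              3, 4, 3),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("sample", 1:4), c("P1", "P2", "P3")))

d <- dist(x)

hc_complete <- hclust(d)                      # default: method = "complete"
hc_average  <- hclust(d, method = "average")
hc_single   <- hclust(d, method = "single")

# Compare the dendrograms side by side.
op <- par(mfrow = c(1, 3))
plot(hc_complete, main = "complete")
plot(hc_average,  main = "average")
plot(hc_single,   main = "single")
par(op)

# Cut one tree into 2 groups (k is an arbitrary choice here).
groups <- cutree(hc_average, k = 2)
print(groups)
```

Different linkage methods can produce different trees from the same distances, so it is worth checking whether the groups you care about are stable across them.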

It is also a good idea to paste part of your data so that we can import it and check whether the problem lies in the data itself or in the way you import it.