Hierarchical Clustering In R
12.7 years ago
Sanju • 0

Hi all,

I have to perform hierarchical clustering in R. I tried the following code to import data from an Excel file, and I created a diagonal matrix.

n_seq <- 15
mat <- matrix(NA, ncol=n_seq, nrow=n_seq)    
for (idx in 1:n_seq) {
  mat[idx,idx] <- 0.0
}  
for(idx in 1:(n_seq-1) ) {
  intemp <- read.xls("C://clustal.xls", sheet = idx );         
  mat[(1+idx):n_seq,idx] <- intemp[1:(n_seq-idx), 11]
}

After this, when I call the hclust function, I get the following error:

hclust(mat)
Error in if (n < 2) stop("must have n >= 2 objects to cluster") : 
argument is of length zero

How can I fix this error? Please help me.

clustering r • 9.3k views

As it stands this is an R usage question, more suited to StackOverflow than here. Please indicate the relevance to bioinformatics.


Your matrix object might be NULL. Also make sure you have Perl installed: read.xls() (from the gdata package) runs Perl in the backend.


This is off topic imho. It's a pure R question.


Also, check the contents of your matrix, mat. It may not contain what you think it should contain.
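One way to run that check is sketched below on a small made-up matrix, filled the same way as `mat` in the question (the distance values are invented for illustration). It also shows a likely cause of the error: hclust() expects an object of class "dist", not a plain matrix. A plain matrix has no "Size" attribute, which would explain why `n` in the error message has length zero. If the values read from the sheets are already pairwise distances, as.dist() can convert the lower triangle directly.

```r
# Toy lower-triangular matrix filled like `mat` in the question
# (the distance values below are made up for illustration).
n_seq <- 4
mat <- matrix(NA_real_, nrow = n_seq, ncol = n_seq)
diag(mat) <- 0
mat[lower.tri(mat)] <- c(0.2, 0.5, 0.9, 0.4, 0.8, 0.3)

# Basic checks: numeric, square, and no missing values below the diagonal
stopifnot(is.numeric(mat), nrow(mat) == ncol(mat))
stopifnot(!anyNA(mat[lower.tri(mat)]))

# A plain matrix carries no "Size" attribute; a dist object does
is.null(attr(mat, "Size"))

# as.dist() reads the lower triangle, so the NA upper triangle is ignored
d <- as.dist(mat)
hc <- hclust(d)
```

If the import step had silently failed, the stopifnot() checks would stop before hclust() is reached, which makes the real problem easier to spot.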

12.7 years ago

Use dist() to calculate the distance matrix. Then feed that into hclust(). Also, you seem to be working a lot harder than you have to.

1) Save your Excel sheet as a tab-delimited text file called foo.txt with the format:

PROBE    P1    P2    P3
sample1    1    2    3
sample2    3    4    6
sample3    6    2    6
sample4    3    4    3

2) Call

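# Read the tab-delimited file; the first column supplies the row names
x <- read.table("foo.txt", sep = "\t", header = TRUE, row.names = 1)
# Compute pairwise distances between rows, cluster them, and plot the dendrogram
plot(hclust(dist(x)))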
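
To try the same pipeline without creating a file first, the example table can be passed inline through the `text` argument of read.table(). This is only a sketch: the variable names `txt`, `d`, and `hc` are illustrative, and whitespace separation is assumed here in place of tabs.

```r
# The example table from step 1, supplied inline instead of via foo.txt.
# The default sep = "" splits on any whitespace.
txt <- "PROBE P1 P2 P3
sample1 1 2 3
sample2 3 4 6
sample3 6 2 6
sample4 3 4 3"

x <- read.table(text = txt, header = TRUE, row.names = 1)

d  <- dist(x)    # Euclidean distances between the four samples
hc <- hclust(d)  # complete-linkage clustering (the default method)
```

Calling plot(hc) afterwards draws the dendrogram, with the sample names taken from the row names of `x`.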

@David I have multiple sheets in my Excel file, and I don't think a *.txt file can hold a workbook with multiple sheets. Is there another method?

9.2 years ago
Mo ▴ 920

If you have multiple sheets, you can read them as follows:

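library(XLConnect)
# Pass the path to your own workbook; system.file() only finds files shipped inside a package
yourM <- loadWorkbook("your data.xlsx")
# With several sheet names, readWorksheet() returns a named list of data frames
Yourdata <- readWorksheet(yourM, sheet = getSheets(yourM))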

First of all, check whether your data was imported correctly. Does it look the way you expect? If so, you can continue as below.

The following is the basic analysis; you can always experiment with different linkage methods, etc.

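# your_data: a numeric data frame or matrix (for example, one sheet of Yourdata)
d <- dist(as.matrix(your_data))  # pairwise Euclidean distances between rows
hc <- hclust(d)                  # complete linkage by default
plot(hc)                         # draw the dendrogram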
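
As a rough illustration of trying different linkage methods, the sketch below clusters the same distances with three of the methods hclust() accepts, then cuts one tree into groups. The data is random and made up, and the names `toy`, `hc_average`, and so on are only for illustration.

```r
set.seed(1)
# Made-up numeric data: 6 samples (rows) x 3 measurements (columns)
toy <- matrix(rnorm(18), nrow = 6,
              dimnames = list(paste0("s", 1:6), NULL))
d <- dist(toy)

# Same distances, three different linkage methods
hc_complete <- hclust(d)                      # default: complete linkage
hc_average  <- hclust(d, method = "average")  # average linkage (UPGMA)
hc_ward     <- hclust(d, method = "ward.D2")  # Ward's criterion

# Cut one of the trees into two groups; the result is a named integer vector
groups <- cutree(hc_average, k = 2)
```

Plotting each tree with plot() side by side is a quick way to see how much the grouping depends on the linkage choice.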

It is also a good idea to paste part of your data so that we can import it and check whether the problem lies in the data itself or in the way you import it.
