Question: Simple Question About Pvclust In R
Max wrote (9.8 years ago):

Hey, I'm a beginner in R and I'm trying to run pvclust to test a cluster solution.

I've managed to load the data and run the hierarchical clustering, but the code I found online for running pvclust keeps producing errors. Could someone point out where I'm going wrong?

Here is my code (the data is already transposed):

## load data

transpose <- 
  read.table("C:/Users/Tim/University/Advanced Design and Data Analysis/Assignment/transposed.csv",
   header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)

View(transpose)
hello <- hclust(dist(model.matrix(~-1 + 
  var001+var002+var003+var004+var005+var006+var007+var008+var009+var010+var011+var012+var013+var014+var015+var016+var017+var018+var019+var020+var021+var022+var023+var024+var025+var026+var027+var028+var029+var030+var031+var032+var033+var034+var035+var036+var037+var038+var039+var040+var041+var042+var043+var044+var045+var046+var047+var048+var049+var050+var051+var052+var053+var054+var055+var056+var057+var058+var059+var060+var061+var062+var063+var064+var065+var066+var067+var068+var069+var070+var071+var072+var073+var074+var075+var076+var077+var078+var079+var080+var081+var082+var083+var084+var085+var086+var087+var088+var089+var090+var091+var092+var093+var094+var095+var096+var097+var098+var099+var100+var101+var102+var103+var104+var105+var106+var107+var108+var109+var110+var111+var112+var113+var114+var115+var116+var117+var118+var119+var120+var121+var122+var123+var124+var125+var126+var127+var128+var129+var130+var131+var132+var133+var134+var135+var136+var137+var138+var139+var140+var141+var142+var143+var144+var145+var146+var147+var148+var149+var150+var151+var152+var153+var154+var155+var156+var157+var158+var159+var160+var161+var162+var163+var164+var165+var166+var167+var168+var169+var170+var171+var172+var173+var174+var175+var176,
   transpose)) , method= "ward")

plot(hello, main= "Cluster Dendrogram for Solution hello", xlab= 
  "Observation Number in Data Set transpose", sub="Method=ward; 
  Distance=euclidian")

### All of the above works fine, except the part below where I try pvclust

library(pvclust)

fit <- pvclust(transpose, method.hclust="ward",
   method.dist="euclidean")

plot(fit) 
pvrect(fit, alpha=.95)
###

The error that comes back is:

> library(pvclust)
> fit <- pvclust(transpose, method.hclust="ward",
+    method.dist="euclidean")
Warning in dist(t(x), method) : NAs introduced by coercion
Error in hclust(distance, method = method.hclust) : 
  NA/NaN/Inf in foreign function call (arg 11)
> plot(fit) 
Error in plot(fit) : object 'fit' not found
> pvrect(fit, alpha=.95)
Error in nrow(x$edges) : object 'fit' not found

Jeremy Leipzig commented (9.8 years ago):

Put your CSV file in a web-accessible location.
Neilfws wrote (9.8 years ago):
Basically, the errors are telling you this:

  1. When you run pvclust(), it calls a function named dist(). The warning "NAs introduced by coercion" means that dist() met values it could not convert to numbers and replaced them with NA (missing values).
  2. pvclust() then calls hclust(), which fails because it cannot accept missing values.
  3. There is then no point in running plot() or pvrect(), because the preceding step never created the object named 'fit', hence: object 'fit' not found.

So your problem is to figure out why "NAs introduced by coercion" is occurring. It's probably because the input data (transpose) is not in the correct form, for example because it contains a non-numeric column.
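
One way to check for this is sketched below. It assumes the culprit is a text column (for instance a label column read in from the CSV), which may or may not be the case here. The data frame `toy` and its column names are made up for illustration and are not the poster's data.

```r
# Toy data frame standing in for the real CSV (hypothetical names and values).
toy <- data.frame(
  id     = c("a", "b", "c"),   # text labels, not measurements
  var001 = c(1.2, 3.4, 5.6),
  var002 = c(2.0, 4.0, 6.0),
  stringsAsFactors = FALSE
)

# Which columns are not numeric? These are the candidates for turning into NA.
non_numeric <- names(toy)[!vapply(toy, is.numeric, logical(1))]
print(non_numeric)

# t() on a data frame with mixed column types yields a character matrix, ...
char_matrix <- t(toy)
print(typeof(char_matrix))

# ... and converting text such as "a" to a number gives NA
# (the coercion warning is suppressed here to keep the output short).
coerced <- suppressWarnings(as.numeric(char_matrix["id", ]))
print(coerced)

# Keep only the numeric columns before clustering.
numeric_only <- toy[, vapply(toy, is.numeric, logical(1)), drop = FALSE]
str(numeric_only)
```

Running `str(transpose)` on the real data gives the same kind of overview: any column reported as `chr` or `Factor` rather than `num` or `int` is worth a closer look.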

Can't provide much more help than that without seeing the CSV file. I suggest reading the package documentation (difficult for beginners, I know) to try and understand how each function works and what arguments it expects. For example, pvclust comes with some sample data which you can load using data(lung), examine to see how it is structured and run through some of the package functions. See the pvclust PDF.
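
A minimal sketch of that kind of inspection follows. It assumes the pvclust package is installed (the block is skipped otherwise); `is_all_numeric` is a small helper written for this example, not part of pvclust.

```r
# Helper (hypothetical, defined here): TRUE if every column is numeric.
is_all_numeric <- function(df) all(vapply(df, is.numeric, logical(1)))

if (requireNamespace("pvclust", quietly = TRUE)) {
  # Load the sample data shipped with the package.
  data(lung, package = "pvclust")

  print(dim(lung))             # number of rows and columns
  print(lung[1:3, 1:3])        # a small corner of the table
  print(is_all_numeric(lung))  # check that no text columns are present
}
```

Comparing this layout with `dim(transpose)` and `str(transpose)` should show whether your own data has the same shape and column types as the sample data.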

Copy/paste from the Web is rarely a good idea when you don't understand the underlying processes.


Max replied (9.8 years ago):

Thanks, you were right: the data needed to be represented as a table, i.e. t(x). Fixed, thanks.
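
On the t(x) point: the warning text in the question, `dist(t(x), method)`, shows that pvclust() transposes its input before computing distances, so it clusters the columns, whereas hclust(dist(x)) clusters the rows. The sketch below illustrates the difference with base R only; the matrix `m` and its labels are invented for the example.

```r
# Hypothetical 4 x 3 matrix: 4 observations (rows), 3 variables (columns).
m <- matrix(c(1, 2, 3, 4,
              2, 4, 6, 8,
              9, 7, 5, 3),
            nrow = 4,
            dimnames = list(paste0("obs", 1:4), paste0("var", 1:3)))

row_dist <- dist(m)      # distances between the 4 rows
col_dist <- dist(t(m))   # distances between the 3 columns

# The items being clustered differ depending on the orientation.
row_tree <- hclust(row_dist, method = "average")
col_tree <- hclust(col_dist, method = "average")
print(row_tree$labels)
print(col_tree$labels)
```

So if the goal is a bootstrap assessment of the same row clustering that hclust(dist(x)) produces, the data passed to pvclust() would need to be t(x), with the rows of x as its columns.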

Neilfws replied (9.8 years ago):

You're welcome. Votes for answers are appreciated.