Question: random forest advice
marion.ryan wrote (3 months ago):

I am using the randomForest package to classify 'Norm' versus 'Chol' tissue with the code below. The output gives a nice ranking of how much each gene in a panel contributes to classifying the diseased tissue. However, I have been reading up on this and am wondering whether I need separate training and test data sets. I have 11 normal and 18 diseased samples. I am very happy with the intuitive output, but I want to make sure it's right.

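library(randomForest)

# Read the expression table; the first column holds the sample names.
clus2 <- read.csv("PCA_NvC_SVM_sig.csv", header = TRUE, row.names = 1, stringsAsFactors = TRUE)
set.seed(71)  # make the forest reproducible
# Pathology must be a factor (Chol/Norm) for a classification forest.
clus2.rf <- randomForest(Pathology ~ ., data = clus2, importance = TRUE, proximity = TRUE)
print(clus2.rf)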

Result:

Call:
 randomForest(formula = Pathology ~ ., data = clus2, importance = TRUE, proximity = TRUE)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 4

        OOB estimate of error rate: 10.34%
Confusion matrix:
     Chol Norm class.error
Chol   17    1  0.05555556
Norm    2    9  0.18181818
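As a sanity check on the numbers above, the OOB error rate and the per-class errors can be recomputed by hand from the confusion matrix. This is only a small sketch in base R; the matrix is typed in from the printed output, and the names `conf`, `class_error` and `oob_error` are made up for illustration.

```r
# Confusion matrix copied from the printed output (rows = true class).
conf <- matrix(c(17, 1,
                  2, 9),
               nrow = 2, byrow = TRUE,
               dimnames = list(actual    = c("Chol", "Norm"),
                               predicted = c("Chol", "Norm")))

# Per-class error: share of each true class that was misclassified.
class_error <- 1 - diag(conf) / rowSums(conf)

# Overall OOB error: all misclassified samples over all 29 samples.
oob_error <- 1 - sum(diag(conf)) / sum(conf)

print(class_error)               # 1/18 for Chol, 2/11 for Norm
print(round(100 * oob_error, 2)) # 3/29, i.e. 10.34 percent
```

So 3 of the 29 samples (1 Chol, 2 Norm) were misclassified when predicted out-of-bag.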

Look at variable importance:

Imp <- round(importance(clus2.rf), 2)
write.table(Imp, "Importance.csv", sep = ",")
varImpPlot(clus2.rf)
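On the training/test question, one detail that seems relevant is where the OOB figure comes from: each sample is predicted only by the trees whose bootstrap sample left it out, so it already behaves somewhat like an internal test set. The sketch below tries to illustrate that. It is not run on my data: it assumes two species from the built-in iris data as a stand-in for 'Chol' and 'Norm', and the names `dat`, `rf`, `oob_pred`, `oob_error` and `reported` are hypothetical.

```r
library(randomForest)

# Stand-in data (assumption): two iris species play the role of Chol/Norm.
dat <- droplevels(subset(iris, Species != "setosa"))

set.seed(71)
rf <- randomForest(Species ~ ., data = dat, ntree = 500)

# With no newdata supplied, rf$predicted holds the out-of-bag prediction
# for each sample: the vote of the trees that did not see that row.
oob_pred  <- rf$predicted
oob_error <- mean(oob_pred != dat$Species)

# The value print(rf) shows as "OOB estimate of error rate".
reported <- unname(rf$err.rate[rf$ntree, "OOB"])

print(c(manual = oob_error, reported = reported))
```

Whether this is enough on its own, or whether a separate held-out set is still needed with only 29 samples, is exactly what I am unsure about.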
