How to perform classification of Differentially expressed genes using Random Forest?
5.7 years ago
gurudeeb99 ▴ 20

Dear All,

I am trying to implement deep learning for cancer biomarkers. I read your article (Machine Learning For Cancer Classification - Part 2 - Building A Random Forest Classifier) and found it very useful, but I have run into some errors.

In my study, the DEGs were extracted from a meta-analysis. I then performed 10-fold cross-validation, which worked fine. Next, I tried to classify the data using Random Forest (RF), following the example scripts. However, I encountered the following error:

for(i in cols){
  DataFrame[,i]=as.factor(DataFrame[,i])
}

Error: Can't use matrix or array for column indexing

This is my model script

library (readxl)
library(randomForest)
set.seed (123)
DataFrame <-ADSC_RF 
View (DataFrame)
structure (DataFrame)
dim(DataFrame)
head(DataFrame,3)
summary(DataFrame)
apply(DataFrame,2,function (x) length (unique(x)))
cols<-c("Target")
for(i in cols){
  DataFrame[,i]=as.factor(DataFrame[,i])
}
str(DataFrame)
library (caTools)
ind = sample.split (Y=DataFrame$Target,SplitRatio = 0.7)
trainDF<-DataFrame[ind,]
testDF<-DataFrame[!ind,]
modelRandom<-randomForest(Target~.,data = trainDF,mtry=3,ntree=20)
modelRandom
importance(modelRandom)
varImpPlot(modelRandom)
PredictionWithClass<-predict(modelRandom, testDF, type = 'class')
PredictionWithClass
t<-table(predictions=PredictionWithClass, actual=testDF$Target)
t
sum (diag(t))/sum (t)
library(pROC)
PredictionsWithProbs<-predict(modelRandom, testDF, type = 'prob')
PredictionsWithProbs
auc<-auc(testDF$Target, PredictionsWithProbs[,2])
auc
plot(roc(testDF$Target,PredictionsWithProbs[,2]))
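# tuneRF expects predictors only as its first argument, so the Target column is dropped here
bestmtry<-tuneRF(trainDF[, names(trainDF) != "Target"], trainDF$Target, ntreeTry=200, stepFactor = 1.5, improve =0.01, trace =T, plot =T)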

This is my source file,

A        B        C        D       E        F       G        H        I        Target
3886.10  1566.40  3336.30  269.77  2386.10  826.20  2728.20  4707.10  3462.10  1
845.29   783.52   909.08   111.97  888.53   167.00  728.97   1111.20  994.12   1
52.43    57.13    1740.30  269.53  1595.60  454.48  1296.00  1528.30  1312.40  1
521.30   170.27   2205.00  208.64  2141.10  567.01  1711.40  1928.50  1692.70  1

I suspect I need to change the column names and include the expression library, but I am not sure.

Can you help me to fix this issue?

R RandomForeset DEGs Cancer Classification • 2.2k views
5.7 years ago

That bit of the code makes no sense. Replace this:

DataFrame[,i]=as.factor(DataFrame[,i])

with this:

DataFrame[[i]]=as.factor(DataFrame[[i]])

Alternatively, remove the whole for loop and just use DataFrame$Target = as.factor(DataFrame$Target).
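
For anyone wondering why the two forms differ, here is a minimal sketch in base R. It assumes (this is a guess from the library(readxl) call, not something stated in the question) that ADSC_RF was read with readxl and is therefore a tibble. On a tibble, `[, i]` always returns another tibble rather than a plain vector, which is the behaviour `drop = FALSE` imitates on an ordinary data frame below. `[[i]]` returns the column itself in both cases. The toy `df` and its Target values are hypothetical stand-ins, not the real data.

```r
# Toy stand-in for ADSC_RF; the Target values are hypothetical.
df <- data.frame(
  A = c(3886.10, 845.29, 52.43, 521.30),
  B = c(1566.40, 783.52, 57.13, 170.27),
  Target = c(1, 1, 0, 0)
)
cols <- c("Target")

# `[` with drop = FALSE keeps the data-frame wrapper around the column.
# Tibbles behave this way by default, so as.factor() would receive a
# one-column table instead of a vector.
sub <- df[, "Target", drop = FALSE]
print(is.data.frame(sub))

# `[[` extracts the column itself as a plain vector.
print(is.numeric(df[["Target"]]))

# Convert every column named in `cols` to a factor, one column at a time.
for (i in cols) {
  df[[i]] <- as.factor(df[[i]])
}
str(df)
```

With a single column the loop is equivalent to the one-line `$` assignment; the loop form is only worth keeping if `cols` later grows to hold several column names.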


Dear Ryan,

Thanks a lot. It's working now with the command "DataFrame$Target = as.factor(DataFrame$Target)".
