Question: Decision tree algorithms
2.5 years ago by
AQ730
AQ730 wrote:

Good morning everyone,

I'm pretty new to decision tree models and I am trying to use them with metagenomic 16S data. My data looks like this:

> str(data)
'data.frame':   156 obs. of  71 variables:
 $ Class                     : Factor w/ 2 levels "cluster1","cluster2": 1 2 1 2 1 1 1 2 1 1 ...
 $ Roseburia                  : num  0.738 1.228 5.414 0.468 0.232 ...
 $ Anaerostipes               : num  0.5978 0.0101 0.6828 0.3118 0 ...
 $ g__.Ruminococcus.          : num  0.330751 0 0.000615 0.114379 0.091286 ...
 $ g__.Ruminococcus.torques   : num  3.399 0.438 0.08 1.324 0.037 ...
 $ Ruminococcaceae            : num  48.16 4.31 16.84 37.66 9.07 ...
 $ Faecalibacteriumprausnitzii: num  0.907 46.443 2.885 17.729 3.378 ...
 $ Coprococcus                : num  0.8083 0.0705 0.0258 0.5133 0 ...
 $ Dehalobacterium            : num  0 0 0 0 0 ...
 $ Dialister                  : num  7.077 0.601 14.224 9.346 2.869 ...
 $ Acidaminococcus            : num  0.29891 0 0.00123 0 0 ...
 $ Coriobacteriaceae          : num  0.1132 0.00252 0.02214 0.60866 0.07402 ...
 $ Streptococcus              : num  0.7499 0 0.0418 0.3622 0.8536 ...
 $ Eggerthellalenta           : num  0.09197 0 0.00677 0.07625 2.22293 ...
 $ Adlercreutzia              : num  0 0 0.00123 0.06808 0 ...
 $ Collinsellaaerofaciens     : num  0.01061 0.01258 0.05782 0.33905 0.00493 ...
 $ Actinomyces                : num  0.11674 0 0.00246 0.00817 0.00987 ...
 $ Bifidobacteriumlongum      : num  0.3219 0.0654 0.1107 0.4385 0.2566 ...
 $ Atopobium                  : num  0.373 0 0 0 0 ...
 $ Turicibacter               : num  0.40327 0.00755 0.00123 0.33905 0.27632 ...

and so on... and I'm running the following commands:

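library("RWeka")
# Read the abundance table; Class must be a factor for classification
data <- read.table("Provadecision.txt", header = TRUE, stringsAsFactors = TRUE)
# Fit Weka's J48 tree using all other columns as predictors
DecisionTree <- J48(Class ~ ., data = data)
DecisionTree
# Note: summary() evaluates the tree on the training data itself
summary(DecisionTree)
if (require("party", quietly = TRUE)) plot(DecisionTree)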

or

library("rpart")
library("rpart.plot")
# Fit an rpart classification tree with the same formula and data
binary.model <- rpart(Class ~ ., data = data)
rpart.plot(binary.model)

But I get completely different results with these two methods. I am pretty confused about it: which one is the right one for my situation? Could anyone please suggest why I am getting two different results?

Thanks a lot,
Andrea


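You're getting different results because you're using different algorithms. Weka's J48 is an implementation of the C4.5 decision tree algorithm, while the rpart package implements the older CART algorithm. The two differ in how they choose splits (C4.5 uses an information-based criterion, the gain ratio; rpart's default for classification is the Gini index) and in how they prune, and their default stopping settings differ as well, so different trees on the same data are to be expected. Check the docs of the package you're using. As for which one you should use, you need some way of evaluating and comparing the results, such as accuracy on held-out samples; otherwise there's no way to decide.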
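One way to make that comparison concrete is k-fold cross-validation: fit each tree on part of the samples and score it on the rest. The following is only a minimal sketch, not a definitive protocol. The helper `cv_accuracy` is a hypothetical name introduced here, the two-class subset of the built-in `iris` data is a stand-in for the 16S table, and 10 folds with plain accuracy as the metric are assumptions you may want to change.

```r
library(rpart)

# Hypothetical helper: k-fold cross-validated accuracy for a classifier with
# a formula interface. `fit_fun` is a function(formula, data) whose result has
# a predict() method accepting type = "class" (rpart and RWeka's J48 both do).
# The outcome column is assumed to be a factor named Class.
cv_accuracy <- function(fit_fun, data, k = 10, seed = 1) {
  set.seed(seed)
  # Randomly assign each row to one of k folds of nearly equal size
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  correct <- 0
  for (i in seq_len(k)) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    model <- fit_fun(Class ~ ., data = train)
    pred  <- predict(model, newdata = test, type = "class")
    correct <- correct + sum(as.character(pred) == as.character(test$Class))
  }
  # Fraction of samples classified correctly while held out
  correct / nrow(data)
}

# Stand-in data (an assumption): two iris species, outcome renamed to Class
demo <- droplevels(iris[iris$Species != "setosa", ])
names(demo)[names(demo) == "Species"] <- "Class"

acc_rpart <- cv_accuracy(rpart, demo)
print(acc_rpart)

# J48 is only evaluated if RWeka (and a working Java) is available
if (requireNamespace("RWeka", quietly = TRUE)) {
  acc_j48 <- cv_accuracy(RWeka::J48, demo)
  print(acc_j48)
}
```

With the real table, the same call would be `cv_accuracy(rpart, data)` and `cv_accuracy(RWeka::J48, data)`. With only 156 samples and 70 predictors, the estimate will vary noticeably between seeds, so repeating it with several seeds gives a fairer picture than a single run.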

written 2.4 years ago by Jean-Karim Heriche