Question: How to interpret the output of resamples() and diff() when comparing the resampling distributions of 3 different trained models with caret in R
3.1 years ago by svlachavas680 (Greece)

Dear Community,

My initial aim was to compare 3 groups of different features, measured on the same set of samples, with respect to a binary categorical outcome in a microarray dataset. I used the same algorithm in R (random forests) through caret's train() function, with the same random seed for each model, as in the following illustrative example:

```r
library(caret)

# `control` is a trainControl() object defined earlier (not shown here).
# metric = "ROC" requires classProbs = TRUE and
# summaryFunction = twoClassSummary in that object.

# Same seed before each call, so all three models use the same resamples
set.seed(1)
t.group1 <- train(x = group1, y = outcome, method = "rf", trControl = control, metric = "ROC")

set.seed(1)
t.group2 <- train(x = group2, y = outcome, method = "rf", trControl = control, metric = "ROC")

set.seed(1)
t.group3 <- train(x = group3, y = outcome, method = "rf", trControl = control, metric = "ROC")

# Collect the resampling results and compute the pairwise differences
models <- list(group1 = t.group1, group2 = t.group2, group3 = t.group3)
resample_results <- resamples(models)
difValues <- diff(resample_results)
summary(difValues)

Call:
summary.diff.resamples(object = difValues)

Upper diagonal: estimates of the difference
Lower diagonal: p-value for H0: difference = 0

ROC
       group1    group2    group3
group1           -0.01056  -0.06222
group2 1.0000000           -0.05167
group3 8.857e-08 0.0004048

Sens
       group1  group2  group3
group1         0.03333 -0.02333
group2 0.72149         -0.05667
group3 0.75545 0.03167

Spec
       group1    group2    group3
group1           -0.060000 -0.066667
group2 0.1978              -0.006667
group3 2.623e-05 1.0000
```

My main confusion is about the upper and lower diagonals mentioned in the caret output. In other words, how can I tell from these tables which values are the adjusted p-values for each pairwise comparison of the models (feature groups), and whether the differences are significant?
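To make the layout concrete, here is a minimal base-R sketch that unpacks the ROC table above into one row per comparison. It only retypes the printed numbers, so it does not need caret or the original data. It relies on the two header lines of the output: cells above the diagonal are the estimated differences, and cells below it are the p-values for the same pair of models. My understanding is that the difference is the row model minus the column model and that `diff()` applies a Bonferroni adjustment by default through its `adjustment` argument, but please treat both points as assumptions to check against `?diff.resamples`. The names `roc`, `pairs`, and `tidy` and the 0.05 cutoff are illustrative choices, not part of caret.

```r
# Model names as they appear in the printed summary
models <- c("group1", "group2", "group3")

# ROC block retyped from the summary output:
# upper triangle = estimated differences, lower triangle = p-values
roc <- matrix(NA_real_, nrow = 3, ncol = 3, dimnames = list(models, models))
roc["group1", "group2"] <- -0.01056
roc["group1", "group3"] <- -0.06222
roc["group2", "group3"] <- -0.05167
roc["group2", "group1"] <- 1.0000000
roc["group3", "group1"] <- 8.857e-08
roc["group3", "group2"] <- 0.0004048

# All unordered pairs of models, one pair per row
pairs <- t(combn(models, 2))

# Cell [a, b] holds the difference; the mirrored cell [b, a] holds its p-value
tidy <- data.frame(
  model_a    = pairs[, 1],
  model_b    = pairs[, 2],
  difference = roc[pairs],
  p_value    = roc[pairs[, 2:1]],
  stringsAsFactors = FALSE
)

# Assumed significance cutoff of 0.05, chosen only for illustration
tidy$significant <- tidy$p_value < 0.05

print(tidy)
```

Read this way, the pair group1/group2 has a difference of -0.01056 with a p-value of 1, while both comparisons involving group3 have p-values far below 0.05. The same unpacking applies to the Sens and Spec tables.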