21 months ago by Kevin · University College London
I presume that you know how to derive a final predictive model? A signature of 1,300 genes seems too large. I have posted various answers in relation to this, such as the following:

Once you decide on your final model, you can test sensitivity, specificity, precision, accuracy, and AUC (and other metrics) by applying your model to the same data on which it was built, and also to new data, i.e., a validation cohort.

Here:

- our regression model is called `bayesmodel`
- our data (the data on which the model was built, or the new data) is called `modeling`
- our outcome / phenotype is stored in a variable called `Disease` (0 / 1) in `modeling`.
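As a minimal, self-contained sketch of how such a model might be derived (the data here are simulated and the predictor names `gene1`–`gene3` are illustrative; your own `modeling` data frame and formula will differ), `bayesmodel` could be a binomial GLM:

```
# simulated example data: 100 samples, 3 gene-expression predictors
set.seed(1)
modeling <- data.frame(
  gene1 = rnorm(100),
  gene2 = rnorm(100),
  gene3 = rnorm(100))

# simulate a 0/1 disease status that depends on gene1 and gene2
modeling$Disease <- rbinom(100, 1,
  plogis(1.5 * modeling$gene1 - modeling$gene2))

# fit a logistic regression of disease status on all predictors
bayesmodel <- glm(Disease ~ ., data = modeling,
  family = binomial(link = "logit"))
summary(bayesmodel)
```

The fitted object then slots directly into the `predict(bayesmodel, type="response")` calls below.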

```
# load the ROCR package for performance metrics
require(ROCR)

# make a model prediction
pred <- prediction(predict(bayesmodel, type="response"), modeling$Disease)
```

```
# determine sensitivity and specificity
ss <- performance(pred, "sens", "spec")

print("Predicted probability cut-off:")
ss@alpha.values[[1]][which.max(ss@x.values[[1]] + ss@y.values[[1]])]
predicted.prob <- round(ss@alpha.values[[1]][which.max(ss@x.values[[1]] + ss@y.values[[1]])], 2)

print("Sensitivity and specificity, weighed equally, are:")
max(ss@x.values[[1]] + ss@y.values[[1]]) / 2
sens.spec <- round(max(ss@x.values[[1]] + ss@y.values[[1]]) / 2, 2)
```

```
# determine the cut-off that minimises the cost function
cost.perf <- performance(pred, measure="cost")
cost <- pred@cutoffs[[1]][which.min(cost.perf@y.values[[1]])]
```
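The post derives several probability cut-offs but does not show one being applied. A hedged sketch of using a chosen cut-off to produce hard class calls and a confusion matrix (the probabilities, labels, and the 0.5 threshold below are illustrative; substitute `predict(bayesmodel, type="response")`, `modeling$Disease`, and the `cost` cut-off computed above):

```
# toy predicted probabilities and observed 0/1 labels, for illustration
prob <- c(0.10, 0.35, 0.48, 0.62, 0.80, 0.95)
obs  <- c(0, 0, 1, 0, 1, 1)

# classify each sample at the chosen probability cut-off (0.5 shown)
cutoff <- 0.5
calls <- ifelse(prob >= cutoff, 1, 0)

# confusion matrix of predicted vs observed status
table(predicted = calls, observed = obs)
```

Sensitivity and specificity at that cut-off can then be read directly off the confusion matrix.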

```
# determine precision and the cut-off at which it is maximised
prec.perf <- performance(pred, measure="prec")
ind <- which.max(slot(prec.perf, "y.values")[[1]])
prec <- slot(prec.perf, "y.values")[[1]][ind]
prec.cutoff <- slot(prec.perf, "x.values")[[1]][ind]
print(c(precision=prec, cutoff=prec.cutoff))
```

```
# determine accuracy and the cut-off at which it is maximised
acc.perf <- performance(pred, measure="acc")
ind <- which.max(slot(acc.perf, "y.values")[[1]])
acc <- slot(acc.perf, "y.values")[[1]][ind]
acc.cutoff <- slot(acc.perf, "x.values")[[1]][ind]
print(c(accuracy=acc, cutoff=acc.cutoff))
```

```
# plot ROC 'curve' and optimum accuracy
par(cex=0.8, mfrow=c(1,2))
require(pROC)

# ROC curve, with AUC and its 95% confidence interval as a text label
roc <- roc(na.omit(modeling$Disease), predict(bayesmodel, type="response"), smooth=FALSE)
plot.roc(roc, add=FALSE, grid=TRUE, grid.lwd=2, col="red", main="")
text(0.25, 0.2, col="red",
  paste("Variant signature:\nAUC [L95-U95]:",
    paste(round(ci.auc(roc)[2], 3), " [",
      round(ci.auc(roc)[1], 3), "-",
      round(ci.auc(roc)[3], 3), "]", sep="", collapse=","), sep=" "),
  cex=1.1, font=2)
text(0.25, 0.05, col="red",
  paste("Sensitivity/Specificity: ", sens.spec*100, "%", sep=""), cex=1.1, font=2)

# accuracy across cut-offs, with the optimum cut-off marked
plot(acc.perf, main="Accuracy, variant signature", avg="vertical", spread.estimate="boxplot")
abline(v=acc.cutoff, lwd=1, lty=2, col="red")
```
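As mentioned above, the same metrics should also be checked on an independent validation cohort; this just means predicting with `newdata=`. The sketch below simulates both cohorts, so the data and the `validation` name are illustrative, not part of the original workflow:

```
set.seed(2)
# simulated training cohort and fitted model (illustrative only)
modeling <- data.frame(gene1 = rnorm(80), gene2 = rnorm(80))
modeling$Disease <- rbinom(80, 1, plogis(2 * modeling$gene1))
bayesmodel <- glm(Disease ~ ., data = modeling, family = binomial)

# simulated independent validation cohort with the same predictors
validation <- data.frame(gene1 = rnorm(40), gene2 = rnorm(40))
validation$Disease <- rbinom(40, 1, plogis(2 * validation$gene1))

# predicted probabilities on the validation cohort
val.prob <- predict(bayesmodel, newdata = validation, type = "response")

# the ROCR / pROC workflow above then applies to these predictions, e.g.:
# pred.val <- ROCR::prediction(val.prob, validation$Disease)
```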

Ignore the odd-looking ROC curve for now. I've also edited out key info from these plots; hence, the weird text labels.

Hope that it helps!

Kevin