ROCR with PLS-Predicted Class Values?
13.2 years ago

Last Wednesday I looked at ROCR for the first time; it is an R package for drawing receiver operating characteristic (ROC) curves. I would like to use it in a setup like this: I have two classes, labeled -1 and 1, and use PLS to predict the class from a numerical representation of my input. This is a single-output PLS model that predicts a single continuous value; if that value is smaller than 0 (the cutoff), the sample is predicted to be in the -1 class, otherwise in the 1 class.

So, the result was like:

-1.51    -1
-0.75    -1
 0.43     1
 0.98     1

The class labels are in the second column and the predictions in the first. The first column can easily be interpreted as probabilities, and ROCR may have been doing just that.

Now, I am not sure what I should pass as the first argument to ROCR's prediction() function, which ambiguously refers to 'predictions'. Can I use ROCR out of the box if my predictions are as described above, or must I convert them to class probabilities first? I prefer the former, and I would like to learn how to properly use ROCR to make plots with such input. The key question is whether ROCR can use such input at all, not how I can change my PLS modeling, which I know how to do.

r classification • 3.5k views
13.2 years ago

Yes and no. You can pass your predictions to prediction() without any modification; just provide the class labels along with them.
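A minimal sketch of such a call, using the four example rows from the question. `prediction()` and `performance()` are ROCR functions; the variable names are illustrative, and passing `label.ordering` explicitly is an assumption made here only to spell out which class is treated as negative and which as positive.

```r
library(ROCR)

# The four example rows from the question: raw PLS outputs and true classes.
scores <- c(-1.51, -0.75, 0.43, 0.98)
labels <- c(-1, -1, 1, 1)

# label.ordering lists the negative class first, then the positive class.
pred <- prediction(scores, labels, label.ordering = c(-1, 1))

# ROCR derives the candidate cutoffs from the observed scores itself,
# so the raw values do not have to be rescaled to probabilities.
perf <- performance(pred, "tpr", "fpr")
cutoffs <- perf@alpha.values[[1]]

# Area under the ROC curve for this toy input.
auc <- performance(pred, "auc")@y.values[[1]]

plot(perf)
```

Only the ranking of the scores matters for the curve, which is why the PLS output can be used as it is.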

But when you predict the class with your model, you are already supplying the classification threshold, and the ROC curve just gives you a picture of the prediction quality. I suggest testing thresholds from -1 to 1 in steps of 0.1 to see which gives the best classification, but that will give you some promiscuity. In most cases modelers argue that the mean (midpoint) threshold is the "de facto" choice (zero in your case, 0.5 for a 0-to-1 scale).
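The threshold scan suggested above could be sketched in base R as follows. This is only an illustration on the four example rows from the question; the variable names and the choice of accuracy as the quality measure are assumptions, not something the thread specifies.

```r
# The four example rows from the question.
scores <- c(-1.51, -0.75, 0.43, 0.98)
labels <- c(-1, -1, 1, 1)

# Candidate thresholds from -1 to 1 in steps of 0.1.
thresholds <- seq(-1, 1, by = 0.1)

# A score below the threshold is assigned to class -1, otherwise to class 1.
accuracy <- sapply(thresholds, function(t) {
  predicted <- ifelse(scores < t, -1, 1)
  mean(predicted == labels)
})

# Thresholds that reach the best accuracy on this toy input.
best <- thresholds[accuracy == max(accuracy)]
print(data.frame(threshold = thresholds, accuracy = accuracy))
```

On real data the scan should be run on held-out samples, otherwise the chosen threshold is tuned to the same data it is evaluated on.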

Here is a piece of code I'm using (an old one, a little bit unoptimized). roc.txt contains the predictions for the different classifications; class.txt contains the classes for those predictions (as many sets as there are classifications). This gave me a lot of ROC curves in one picture.

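library("ROCR")

# Predicted values, comma-separated; ROCR treats each column as one run.
target_pred <- as.matrix(read.table("roc.txt", sep = ",", header = FALSE))
# Matching true class labels, tab-separated, with the same dimensions.
target_class <- as.matrix(read.table("class.txt", sep = "\t", header = FALSE))

# Build the prediction object and compute TPR vs. FPR for every column.
pred <- prediction(target_pred, target_class)
perf <- performance(pred, "tpr", "fpr")

# Draw all ROC curves into a single PNG file.
png("target_ROC.png", bg = "white")
plot(perf, col = "black", lty = 3)
dev.off()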
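The thread does not show the contents of roc.txt or class.txt, so the following is only a sketch of the shape ROCR accepts for several runs at once: two matrices with identical dimensions, one column per run. The first column reuses the question's four rows; the second column holds made-up values for a hypothetical second run, and all names are illustrative.

```r
library(ROCR)

# One column per run, one row per sample. run2 is hypothetical data.
target_pred <- cbind(run1 = c(-1.51, -0.75, 0.43, 0.98),
                     run2 = c(-0.90, 0.20, -0.10, 1.30))

# True classes for the same samples, repeated for each run.
target_class <- cbind(run1 = c(-1, -1, 1, 1),
                      run2 = c(-1, -1, 1, 1))

# Each column pair becomes its own ROC curve.
pred <- prediction(target_pred, target_class)
perf <- performance(pred, "tpr", "fpr")
n_curves <- length(perf@y.values)

plot(perf, col = "black", lty = 3)
```

Every column of labels must contain both classes, otherwise `prediction()` stops with an error.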
Sorry, it seems that SE does not send me an email about comments. Take a look at the examples at https://github.com/chupvl/R_scripts

Please add some example rows of roc.txt and class.txt; it's not useful this way.

By the way, I know what the threshold does. In fact, ROCR varies that threshold (cutoff) itself to produce the curve, so there is no need to change it myself. Otherwise, with -1 and 1, a cutoff of 0 seems a good initial guess, just like you indicate.
