Question

Matthews Correlation Coefficient Vs. Receiver Operating Characteristic

11.3 years ago

Pappu ★ 2.1k

I am wondering which measure is better, and in which cases, for assessing the results of a binary classifier such as an SVM.

python • 9.5k views

score 6 · Answer 1 · 2013-01-21

It depends on what you want to know or say about the classifier's performance. I assume that for ROC what you actually intend to report is the area under the curve (AUC)? Personally, I think it is best to report several performance measures. In the past I have reported the AUC-ROC, AUC-PR (precision-recall), Matthews correlation coefficient (MCC), F-measure, and some of the rawer numbers for several different datasets.

It has been argued that the MCC is a more balanced summary statistic of the confusion matrix when the classes are imbalanced, and I tend to agree. In my experience the MCC was more stable than ROC curves across different class sizes and class ratios, although the statistician we regularly work with wasn't entirely convinced of the theoretical underpinnings of that argument.
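To make the "balanced summary of the confusion matrix" point concrete, here is a minimal sketch in plain Python. The counts are made up for illustration (95 negatives, 5 positives, and a classifier that calls everything negative), and returning 0.0 when the denominator is zero is a common convention that I am assuming here, not something the definition itself dictates.

```python
import math

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: if any row or column of the confusion matrix
    # is empty, the MCC is undefined, so report 0.0 (no correlation).
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical imbalanced data set: 5 positives, 95 negatives,
# and a classifier that predicts "negative" for every sample.
tp, fp, fn, tn = 0, 0, 5, 95

print(accuracy(tp, fp, fn, tn))  # 0.95, looks good despite finding no positives
print(mcc(tp, fp, fn, tn))       # 0.0, reflects that nothing was learned
```

Accuracy rewards the majority-class guess, while the MCC uses all four cells of the confusion matrix and so does not.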

If you have only one reasonable cut-off for your classifier, the MCC is also the more useful measure, since by definition a ROC curve traces performance over all possible cut-offs/conditions for the SVM.
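The difference between a single cut-off and all cut-offs can be sketched as follows. The labels and decision scores are invented for illustration, the helper names are my own, and the AUC is computed through its pairwise-ranking interpretation (the probability that a random positive scores higher than a random negative) rather than by integrating the curve.

```python
import math

def confusion(y_true, scores, cutoff):
    """Count TP, FP, FN, TN when predicting positive for score >= cutoff."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def mcc_at(y_true, scores, cutoff):
    """MCC at one specific cut-off (0.0 if undefined, by assumed convention)."""
    tp, fp, fn, tn = confusion(y_true, scores, cutoff)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. No cut-off is involved at all.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and SVM-style decision scores.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [-1.2, -0.8, -0.5, -0.1, 0.2, 0.4, 0.9, 1.5]

print(roc_auc(y_true, scores))      # one number for the whole ranking
print(mcc_at(y_true, scores, 0.0))  # depends on the chosen cut-off
print(mcc_at(y_true, scores, 0.5))  # a different cut-off, a different MCC
```

The AUC stays the same whatever threshold you deploy, whereas the MCC describes the classifier as you would actually use it at that one threshold.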