Question: How to test the robustness and performance of a novel classification algorithm for gene expression data
fbrundu wrote, 4.9 years ago:

Hi all,

Suppose you have a new algorithm that you want to publish. Are there any best practices or methodologies you usually consider in order to test the robustness and performance of new methods?

The case is a novel classification algorithm for gene expression samples.

Thanks

Charles Warden wrote, 4.9 years ago:

At the very least, you should do cross-validation (such as leave-one-out cross-validation) on a dataset. You can also apply the algorithm to other publicly available datasets (if they have metadata for the characteristic that you are trying to predict), which I think is a better test.
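For illustration, a minimal LOOCV loop in R; the expression matrix X (samples x genes), the label factor y and the k-nearest-neighbour classifier are all hypothetical stand-ins for your own data and method:

    library(class)  # knn() is only a stand-in for the method being evaluated

    # X: hypothetical expression matrix (samples x genes), y: factor of class labels
    n <- nrow(X)
    preds <- character(n)
    for (i in seq_len(n)) {
      # hold out sample i, train on everything else, predict the held-out sample
      preds[i] <- as.character(
        knn(train = X[-i, ], test = X[i, , drop = FALSE], cl = y[-i], k = 5)
      )
    }
    loocv_accuracy <- mean(preds == as.character(y))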

In both cases, you can use something like the ROCR package to create an ROC plot showing the trade-off between sensitivity and specificity. Creating a table with statistics like the positive predictive value and negative predictive value would also be nice. However, these are all relevant for binary variables - not sure if that is what you are trying to predict.
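For the ROC part, a small ROCR sketch, assuming hypothetical vectors scores (the classifier's numeric scores) and labels (the true binary classes):

    library(ROCR)

    # scores: hypothetical numeric classifier scores, labels: true binary (0/1) classes
    pred <- prediction(scores, labels)
    perf <- performance(pred, measure = "tpr", x.measure = "fpr")
    plot(perf)                                               # ROC curve
    auc <- performance(pred, measure = "auc")@y.values[[1]]

    # PPV and NPV across score cutoffs, for the statistics table
    ppv <- performance(pred, measure = "ppv")
    npv <- performance(pred, measure = "npv")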


Thanks for the suggestions. I will try them.

written 4.9 years ago by fbrundu
mikhail.shugay wrote, 4.9 years ago:

If you mean classification in the strict sense, i.e. supervised classification of samples based on gene expression, then the most basic things to do are:

  1. Train your classifier with relatively large negative and positive sets. Report precision and recall using cross-validation.
  2. Select positive and negative validation sets (more is better), ensure that those samples were not used during training, and report the precision and recall of the trained classifier on those sets.

The second step is really critical to show that you're not over-fitting the data; a rough sketch of this workflow follows below.
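A minimal sketch in R, assuming hypothetical training objects train_x/train_y and a validation set valid_x/valid_y that was never touched during training (k-NN is again only a stand-in classifier):

    library(class)  # knn() as a stand-in classifier

    # train_x/train_y: hypothetical training expression matrix and labels
    # valid_x/valid_y: hypothetical validation set never used during training
    pred <- knn(train = train_x, test = valid_x, cl = train_y, k = 5)

    # precision and recall, treating one class of interest as "positive"
    tp <- sum(pred == "positive" & valid_y == "positive")
    fp <- sum(pred == "positive" & valid_y != "positive")
    fn <- sum(pred != "positive" & valid_y == "positive")
    precision <- tp / (tp + fp)
    recall    <- tp / (tp + fn)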


Thanks Mikhail. What do you mean by negative and positive sets?

written 4.9 years ago by fbrundu

I mean that if you have a binary classifier that tells, for example, whether a sample comes from tumor or normal tissue, then the positive set will be tumor expression datasets and the negative set will be normal expression datasets. Of course, it all depends on what your classifier is meant to do.

written 4.9 years ago by mikhail.shugay

Unfortunately it is not a binary but an n-class classifier. Is there any related technique that is used the most?

written 4.9 years ago by fbrundu

The simplest way is to split the problem into several binary classification problems, so that the positive set is one sample type and the negative set is comprised of the other types. Note that positive sets should have a sufficient number of associated samples. For sample types characterized by few samples, it is better to leave them aside and then manually check whether they are classified to a reasonable cluster. For accuracy measures for the n-class classification problem, have a look at http://rali.iro.umontreal.ca/rali/sites/default/files/publis/SokolovaLapalme-JIPM09.pdf
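For example, macro-averaged precision and recall over the one-vs-rest splits could be computed roughly like this, assuming hypothetical factors truth and pred that share the same set of levels:

    # truth, pred: hypothetical factors of true and predicted class labels,
    # built with the same levels so the confusion matrix is square
    cm <- table(truth, pred)              # rows = true class, columns = predicted class

    precision <- diag(cm) / colSums(cm)   # per-class precision (one-vs-rest)
    recall    <- diag(cm) / rowSums(cm)   # per-class recall (one-vs-rest)

    macro_precision <- mean(precision, na.rm = TRUE)
    macro_recall    <- mean(recall, na.rm = TRUE)
    macro_f1        <- mean(2 * precision * recall / (precision + recall), na.rm = TRUE)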

written 4.9 years ago by mikhail.shugay
Christian wrote, 4.9 years ago:

I stopped believing results from any classifier built from high-dimensional input data (like gene expression data sets) unless the results are shown to replicate on a completely independent data set, ideally by another research group. Cross-validation is a minimum must-have, but even with it there is just too much data massaging and overfitting going on.

So if you have access to an independent data set, use it to assess the performance of your classifier before publishing, but be honest - don't cheat and tune your classifier afterwards to improve the results. I know this sounds harsh, but the field has been plagued by unreproducible and non-replicable results for too long.
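As an illustration of that kind of external check, a sketch in R using GEOquery to pull a public dataset; the accession, the metadata column and the predict_class() wrapper are all placeholders, not part of any real pipeline:

    library(GEOquery)  # Bioconductor package for fetching public GEO datasets
    library(Biobase)

    # "GSExxxxx" is a placeholder accession for a hypothetical independent cohort
    gse  <- getGEO("GSExxxxx")[[1]]
    expr <- exprs(gse)   # probes/genes x samples expression matrix
    meta <- pData(gse)   # sample annotation; the column holding the true class varies by dataset

    # predict_class() stands in for the already-trained, frozen classifier;
    # apply it once, with no further tuning, and compare against the independent labels
    pred <- predict_class(t(expr))
    table(pred, meta$characteristics_ch1)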


Thanks Christian, I will follow your advice.

written 4.9 years ago by fbrundu