Question

A question on microarray-based validation study

7.2 years ago

BioMed ▴ 50

Dear friends,

Our goal is to identify a set of genes (a genetic signature) for the diagnosis and sub-group classification of a particular disease. To this end, we ran a microarray-based gene expression experiment with two data sets, let's say exploration (E) and validation (V). Here is our problem.

Using the exploration set and the class assignment analysis on arraymining.net (eBayes, SVM, 10-fold cross-validation), we found a 53-gene signature that can classify disease versus normal. Now we want to test the robustness of that result using the validation set. My guess is that we just need to take the expression values of the 53 genes detected in the previous analysis and perform the same analysis on arraymining.net. Nevertheless, this seems somehow wrong to me.

Please give me your advice and/or a solution.

Thank you very much and I look forward to hearing from you.

Respectfully yours,

gene validation • 1.4k views

ADD COMMENT • link updated 7.2 years ago by Jean-Karim Heriche 27k • written 7.2 years ago by BioMed ▴ 50

score 1 · Answer 1 · 2017-02-18

7.2 years ago

Jean-Karim Heriche 27k

Unless I misunderstand something, I don't see anything wrong with using the classifier you've trained on one data set to classify another similar data set. This is the best test of the robustness of the classifier. If it performs well on an independent data set, then it is probably robust. If not, this means your classifier overfit the training data.
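To make the idea concrete, here is a minimal sketch in plain Python of "fit once on E, apply unchanged to V". It uses a nearest-centroid classifier as a simple stand-in for the SVM, and the two-gene toy data and all function names are made up for illustration. The point is only that nothing is refit or re-selected when V is scored.

```python
from statistics import mean

def fit_centroids(samples, labels):
    """Learn one mean expression profile (centroid) per class from the training set."""
    centroids = {}
    for cls in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        centroids[cls] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, sample):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda cls: dist(centroids[cls]))

def accuracy(centroids, samples, labels):
    """Fraction of samples whose predicted class matches the known label."""
    hits = sum(predict(centroids, s) == y for s, y in zip(samples, labels))
    return hits / len(labels)

# Hypothetical toy data: rows are samples, columns are signature genes.
E_x = [[5.1, 2.0], [4.9, 2.2], [1.0, 6.1], [1.2, 5.8]]
E_y = ["disease", "disease", "normal", "normal"]
V_x = [[5.0, 2.5], [0.8, 6.0], [4.0, 3.0]]
V_y = ["disease", "normal", "disease"]

model = fit_centroids(E_x, E_y)    # trained on E only
acc_E = accuracy(model, E_x, E_y)  # optimistic: same data used for training
acc_V = accuracy(model, V_x, V_y)  # independent estimate: model is frozen
print("accuracy on E:", acc_E)
print("accuracy on V:", acc_V)
```

A large drop from the accuracy on E to the accuracy on V would be the sign of overfitting mentioned above.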

ADD COMMENT • link 7.2 years ago by Jean-Karim Heriche 27k

Dear Heriche,

Thank you for your answer. It is clearer now.

However, we now have a technical problem. If we use the feature provided by arraymining.net to test our 53-gene signature, the algorithm tries to select a smaller set of genes for prediction. Therefore, it is not actually a validation of the classifier. Could you please advise us or provide some hints on an analysis tool?

Respectfully yours,

ADD REPLY • link 7.2 years ago by BioMed ▴ 50

I don't know how arraymining.net works, but from what you write, it seems that you can't reapply an already trained classifier there. I would suggest doing the analysis yourself. For example, in R, empirical Bayes statistics are available in the limma package and SVMs in the kernlab package.
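As a rough illustration of what doing it yourself involves, here is a plain Python sketch of the gene selection step. Genes are ranked on the exploration set only, and the validation set is then restricted to the same gene names with no re-ranking. It uses an ordinary Welch t-statistic as a stand-in; limma's moderated t additionally shrinks the per-gene variances, so this is not equivalent to eBayes. The gene names, values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic for two groups (each needs >= 2 values and nonzero variance)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def select_signature(expr, labels, k):
    """Rank genes by |t| between the two classes and return the top-k gene names."""
    cls0, cls1 = sorted(set(labels))
    scores = {}
    for gene, values in expr.items():
        g0 = [v for v, y in zip(values, labels) if y == cls0]
        g1 = [v for v, y in zip(values, labels) if y == cls1]
        scores[gene] = abs(welch_t(g0, g1))
    return sorted(scores, key=scores.get, reverse=True)[:k]

def subset(expr, genes):
    """Restrict an expression table to a fixed gene list, in that order, without re-ranking."""
    return {g: expr[g] for g in genes}

# Hypothetical exploration set: gene name -> expression value per sample.
E_expr = {
    "GENE_A": [8.1, 7.9, 8.3, 2.0, 2.2, 1.9],
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 4.8, 5.0],
    "GENE_C": [3.0, 3.4, 2.9, 6.1, 5.8, 6.3],
}
E_labels = ["disease"] * 3 + ["normal"] * 3

# Hypothetical validation set with the same genes measured on new samples.
V_expr = {
    "GENE_A": [7.5, 2.4],
    "GENE_B": [5.1, 4.9],
    "GENE_C": [3.3, 5.9],
}

signature = select_signature(E_expr, E_labels, k=2)  # chosen on E only
V_sig = subset(V_expr, signature)                    # V never influences the list
print(signature)
```

The classifier trained on E with these genes would then be applied to `V_sig` as is. If the genes were re-selected on V, the result would no longer be a validation of the original signature.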