ROC curve for biomarkers
7.5 years ago

Hello,

For the last several days I have been trying to draw a ROC curve for my biomarker study. Unfortunately, I have not found a good explanation of how this is done for biomarkers. I would appreciate some guidelines.

My current results are from DESeq2, and I am not sure how to prepare the input data for a ROC curve or which tool is best for drawing it.

Thanks a lot in advance

-- Andrzej

RNA-Seq R next-gen • 4.7k views
7.5 years ago
ddiez ★ 2.0k

For plotting I would recommend the ROCR package. See also this and this website. How to use it for a biomarker study depends very much on what exactly you are doing.

Thanks a lot for suggesting tools.

I am doing a differential expression analysis with DESeq2 between healthy and unhealthy groups. I have around 20 significantly differentially expressed genes. I need to check which combination of biomarkers (i.e. 3 or more) gives a good AUC (I need a value around 0.98).

Just a tip: be careful about how you train and validate your biomarker predictions, i.e. set some samples aside to assess performance. An AUC of 0.98 is very high for most tasks. If you evaluate your biomarkers on the same data you used to find them, the estimate will often be substantially overfit and will not generalize to new data (i.e. it will not be reproducible).

Of course, you are right. I was thinking of an AUC between 0.8 and 0.9.

I am not an expert on ROC at all, but my understanding is that it is used to assess the performance of a classifier. From your description, it is not clear to me whether you are doing classification. Are you trying to find out whether any of the DE genes can be used as biomarkers? That is, whether they can distinguish between healthy and disease? How do you define true positives? (Well, I guess this illustrates my ignorance of the topic.)

Yes, you are right. My goal is to find out whether any of the DE genes can be used as a biomarker for the specific disease (that is, to discriminate between healthy and disease). I want to create prediction (ROC) curves and check what AUC a combination of 3, 5 or 7 chosen genes gives, and how much each larger combination improves it. A true positive is a sample that the biomarker flags as diseased and that really is diseased.
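To make the definition above concrete, here is a minimal sketch of how a ROC curve and its AUC can be computed for a single gene. It is written in plain Python only to show the mechanics; the expression values and disease labels are invented for illustration, and the function names `roc_points` and `trapezoid_auc` are hypothetical. It assumes you have exported one normalized (for example variance-stabilized) value per sample from DESeq2, and that higher expression points towards disease. For a down-regulated gene, negate the values first.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs from sweeping a threshold over the scores.

    labels: 1 = disease, 0 = healthy.
    Assumes a higher score means "more likely disease".
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Rank samples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Samples with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1  # diseased sample called diseased: true positive
            else:
                fp += 1  # healthy sample called diseased: false positive
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def trapezoid_auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented normalized expression of one candidate gene, one value per sample.
expression = [8.1, 6.4, 9.4, 6.2, 5.8, 6.5, 7.0, 8.8]
status = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = disease, 0 = healthy

curve = roc_points(expression, status)
print("AUC:", trapezoid_auc(curve))  # prints AUC: 0.875
```

The list returned by `roc_points` is what you would plot (false positive rate on the x axis, true positive rate on the y axis). In R, the ROCR package suggested above should give you the same curve from the same two inputs, a score vector and a label vector.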

I may have misunderstood what you are doing, but shouldn't you build the model on one dataset (the training set) and use an independent dataset to evaluate its performance and draw the ROC curve?
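A small sketch of that train/evaluate separation, under stated assumptions: the data are invented, and the "model" is a deliberately simple signed z-score sum (my stand-in, not a method anyone in this thread proposed). `fit_signature`, `signature_score` and `auc` are hypothetical helper names. The point is only that everything learned from the data (direction, centering, scaling) comes from the training samples, and the AUC is then computed on samples the model has never seen.

```python
from statistics import mean, pstdev


def auc(scores, labels):
    """AUC as the fraction of (disease, healthy) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_signature(X_train, y_train):
    """Learn per-gene center, scale and direction from training samples only."""
    model = []
    for g in range(len(X_train[0])):
        col = [row[g] for row in X_train]
        pos = [v for v, y in zip(col, y_train) if y == 1]
        neg = [v for v, y in zip(col, y_train) if y == 0]
        direction = 1.0 if mean(pos) >= mean(neg) else -1.0
        sd = pstdev(col) or 1.0  # fall back to 1.0 for a constant gene
        model.append((mean(col), sd, direction))
    return model


def signature_score(model, sample):
    """Sum of signed z-scores; higher means 'more disease-like'."""
    return sum(d * (v - m) / s for v, (m, s, d) in zip(sample, model))


# Invented data: rows are samples, columns are two candidate genes.
X_train = [[9.0, 3.0], [8.5, 3.5], [8.0, 2.5], [6.0, 5.0], [6.5, 5.5], [5.5, 4.5]]
y_train = [1, 1, 1, 0, 0, 0]
X_test = [[8.8, 3.2], [7.0, 4.6], [6.2, 5.1], [7.4, 4.0]]
y_test = [1, 1, 0, 0]

model = fit_signature(X_train, y_train)
train_auc = auc([signature_score(model, s) for s in X_train], y_train)
test_auc = auc([signature_score(model, s) for s in X_test], y_test)
print("training AUC:", train_auc)  # prints training AUC: 1.0
print("held-out AUC:", test_auc)   # prints held-out AUC: 0.75
```

With these toy numbers the AUC on the training samples is perfect while the held-out AUC is lower, which is the optimism described above. In a real study the gene selection itself (the DESeq2 step) would also have to be done on the training samples only, otherwise the test samples have already leaked into the signature.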

6.3 years ago
antgomo ▴ 30

So, your idea will be (correct me if I am wrong):

Imagine you have a set of 2000 DE genes from your DESeq2 analysis, and you want to iteratively generate subsets of 7-10 genes that go into a randomForest/SVM classification of the samples; the group/combination of genes that reaches an AUC of 0.98 will be your signature. Is that it?
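If that is the idea, the loop could look roughly like the sketch below. Assumptions, stated loudly: the expression matrix is invented, the subset size is 2 instead of 7-10 to keep it readable, and instead of randomForest/SVM it uses a very simple midpoint-distance score so that it runs with the standard library alone. `loo_scores`, `rank_subsets` and `auc` are hypothetical names. Each sample is scored by a model fit on all the other samples (leave-one-out), so no sample is used to score itself.

```python
from itertools import combinations
from statistics import mean


def auc(scores, labels):
    """AUC as the fraction of (disease, healthy) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loo_scores(X, y, genes):
    """Score every sample with a model fit on all the other samples."""
    scores = []
    for i in range(len(X)):
        train = [(row, lab) for k, (row, lab) in enumerate(zip(X, y)) if k != i]
        total = 0.0
        for g in genes:
            pos = mean(row[g] for row, lab in train if lab == 1)
            neg = mean(row[g] for row, lab in train if lab == 0)
            direction = 1.0 if pos >= neg else -1.0
            # Signed distance from the midpoint between the two group means.
            total += direction * (X[i][g] - (pos + neg) / 2.0)
        scores.append(total)
    return scores


def rank_subsets(X, y, n_genes, size):
    """Leave-one-out AUC for every gene subset of the given size, best first."""
    results = []
    for subset in combinations(range(n_genes), size):
        results.append((auc(loo_scores(X, y, subset), y), subset))
    return sorted(results, reverse=True)


# Invented data: 8 samples (rows) x 4 candidate genes (columns).
gene_names = ["geneA", "geneB", "geneC", "geneD"]  # placeholder names
X = [
    [9.1, 3.0, 5.2, 7.7],
    [8.7, 3.4, 4.8, 6.1],
    [8.9, 2.8, 5.5, 7.0],
    [8.2, 3.9, 5.0, 6.6],
    [6.3, 5.1, 5.1, 6.9],
    [6.8, 4.7, 4.9, 7.4],
    [6.0, 5.4, 5.3, 6.3],
    [7.1, 4.4, 5.0, 7.1],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

for value, subset in rank_subsets(X, y, len(gene_names), 2):
    print([gene_names[g] for g in subset], round(value, 3))
```

One caveat on the design: the number of subsets grows very quickly (7 genes out of 2000 is far too many to enumerate), and picking the single best-scoring subset out of many is itself a form of fitting. Whichever subset wins would still need to be checked on samples that played no part in the search.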
