Random forest with DE genes
8 weeks ago
jmoon1194 • 0

Hi all,

I have some RNA-seq data with 2 classes (cancer/normal) that I ran DESeq2 on to obtain significant DE genes. My lab is interested in seeing how well our 'significantly DE' genes can classify cancer/normal samples, and is using the AUC score and ROC plot to visualize the performance.

Does it make sense to train the random forest with the same pre-defined list, instead of using a feature selection method? Can this artificially inflate the AUC scores if used with LOOCV?

Please let me know your thoughts, questions or concerns you may have. I am fairly untrained and want to learn as much as I can (but am under pressure to deliver with no guidance/mentorship).

Thank you for your time, J

RandomForest RNA-seq R • 357 views
8 weeks ago
dsull ★ 3.7k

Yes, you should use your pre-defined list since you want to see how well those genes can classify tumor vs normal.

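You can use LOOCV for validation, but because the gene list was selected with DESeq2 on these same samples, the LOOCV AUC will be optimistically biased. You should therefore also test your classifier on an unseen dataset (i.e. one you have not already run DESeq2 on and have not looked at previously).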
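
To illustrate why LOOCV on the samples used for gene selection can look better than it should, here is a minimal sketch in plain Python. It is not the analysis from this thread. It assumes pure-noise "expression" values (so no gene is truly informative), ranks genes by absolute difference in class means as a crude stand-in for DESeq2, and scores samples with a nearest-centroid rule as a stand-in for a random forest. All function names and the sizes (20 samples, 1000 genes, top 10) are made up for illustration.

```python
import random

def centroid(rows, genes):
    """Mean expression of each listed gene across the given samples."""
    return [sum(r[g] for r in rows) / len(rows) for g in genes]

def top_genes(X, y, k):
    """Pick the k genes with the largest absolute difference in class means
    (a crude, assumed stand-in for a DE test such as DESeq2)."""
    pos = [x for x, lab in zip(X, y) if lab == 1]
    neg = [x for x, lab in zip(X, y) if lab == 0]
    all_genes = range(len(X[0]))
    mp, mn = centroid(pos, all_genes), centroid(neg, all_genes)
    ranked = sorted(all_genes, key=lambda g: abs(mp[g] - mn[g]), reverse=True)
    return ranked[:k]

def score(x, X_train, y_train, genes):
    """Nearest-centroid score: higher means closer to the class-1 centroid."""
    pos = centroid([r for r, lab in zip(X_train, y_train) if lab == 1], genes)
    neg = centroid([r for r, lab in zip(X_train, y_train) if lab == 0], genes)
    d_pos = sum((x[g] - c) ** 2 for g, c in zip(genes, pos))
    d_neg = sum((x[g] - c) ** 2 for g, c in zip(genes, neg))
    return d_neg - d_pos

def auc(scores, labels):
    """AUC as the fraction of (class 1, class 0) pairs ranked correctly."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def loocv_auc(X, y, k, select_inside_fold):
    """Leave-one-out CV; gene selection is either redone per fold or done once on all samples."""
    fixed = None if select_inside_fold else top_genes(X, y, k)  # sees every sample
    scores = []
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = top_genes(X_tr, y_tr, k) if select_inside_fold else fixed
        scores.append(score(X[i], X_tr, y_tr, genes))
    return auc(scores, y)

random.seed(0)
n_samples, n_genes, k = 20, 1000, 10
# Pure noise: class labels carry no real signal.
X = [[random.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
y = [1] * 10 + [0] * 10

leaked_auc = loocv_auc(X, y, k, select_inside_fold=False)
honest_auc = loocv_auc(X, y, k, select_inside_fold=True)
print(f"genes chosen on all samples:   AUC = {leaked_auc:.2f}")
print(f"genes chosen inside each fold: AUC = {honest_auc:.2f}")
```

Since the data are noise, any apparent skill in the first number comes from the held-out sample having already influenced which genes were chosen. Redoing the selection inside each fold, or scoring on a truly independent dataset, removes that advantage.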

All that said, I'm not sure what you'd gain from such an analysis: you already have your DESeq2 results, which tell you which genes are, on average, higher (or lower) in tumor vs. normal. Those genes being good at classifying tumor vs. normal is not a particularly surprising result.


Thank you for your reply! I greatly appreciate it. I agree with you on the last point of this not being very informative... but gotta do as I'm told for now :/

One last question: for testing the classifier on an unseen dataset, can it be any tumor/normal tissue dataset, or does it have to be from the same tissue type (parathyroid in this case)?


Ideally the same tissue type; your classifier is unlikely to perform well on a different tissue type (though you can try). Generally, training on apples and testing on oranges does not yield good results.
