Differentially expressed genes machine learning classifier
8 months ago
devknight009 ▴ 40

I am new to R and machine learning. I want to build a machine learning classifier that distinguishes normal from diseased samples, using differentially expressed genes (DEGs) obtained from GEO microarray datasets as input features. I have obtained my DEGs using the limma package. How do I now use these DEGs to train the classifier? Please help.

R Machine Learning DEGs • 634 views

"I am new to ... machine learning"

Why do you want to use ML here? What are some existing methods, and what are some flaws in them that you're trying to solve using ML?


Using machine learning, I want to show that these DEGs can act as biomarkers by distinguishing normal samples from diseased samples.


A lot of people have done this kind of thing. I think classifiers for benign vs malignant thyroid tumors based on RNA-seq were one example I remember from a few years back. This is a googleable thing.


Coincidentally, I have published in this area via a TCGA re-analysis, but not 'Machine Learning': Comprehensive transcriptomic analysis of papillary thyroid cancer: potential biomarkers associated with tumor progression


Yeah, I tried to find the exact paper but couldn't. I think there is a group of companies that do this, though. Something like: the sample needed for histology is invasive to get, so they just take a little bit of RNA and try to classify benign tumors that way. Machine learning stuff is so in vogue that I think people want to use it to check a buzzword box, but I remember thinking this application was kind of neat and made sense.


What is lacking in GSEA etc. that ML can solve? What is your definition of "normal" and "diseased"? What are your DE groups?


I have to design a project related to ML. I have taken a GEO microarray dataset; it has microarray data for control samples and Parkinson's samples obtained from blood. I have found DEGs using limma, and now want to use these DEGs for ML classification of control and Parkinson's samples.
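For what it's worth, the data layout this implies can be sketched in base R. This is only an illustration on simulated numbers: `expr`, `group` and `deg_ids` are hypothetical stand-ins for the real expression matrix, the sample labels and the DEG list from limma.

```r
# Hypothetical stand-in data: 'expr' is a genes x samples expression matrix,
# 'group' holds the sample labels, 'deg_ids' the DEG identifiers from limma.
set.seed(1)
expr <- matrix(rnorm(200 * 20), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:20)))
group <- factor(rep(c("Control", "Parkinson"), each = 10))
deg_ids <- c("gene3", "gene17", "gene42")  # placeholder DEG list

# Modelling functions in R generally expect samples in rows and features in
# columns, so keep only the DEG rows and transpose.
x <- t(expr[deg_ids, , drop = FALSE])
dat <- data.frame(x, group = group)
dim(dat)
```

The resulting `dat` has one row per sample, one column per DEG, plus the class label, which is the shape most classifiers want.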


"I have to design a project related to ML"

That seems to be a sub-optimal way of approaching a problem. "Is ML useful here" should be the question. Anyway, like curious says, I'm sure there are a lot of people that have run classifiers on public datasets. Are you doing a toy project or a real one?


It's a real one; ML classification is the first step in it.


I think it is clear what you want to do. The thing is that Biostars is intended to answer specific technical questions rather than to guide you through a topic that you apparently have no background in. I suggest you dive into the available online resources, textbooks and courses at your institution and get a solid foundation first. You will rarely find users online who will provide an end-to-end workflow for you, especially given that you want to develop something on your own.


I don't want an end to end workflow. I'll be happy if someone can suggest any particular R package to look at or any particular blog

8 months ago

You can probably do this using something simple like a logistic regression classifier. Try searching for "logistic regression in R". Remember that doing good ML is about more than just picking the correct algorithm. You have to carefully design training, validation and test sets, or use k-fold cross-validation, and think carefully about what metrics you use to assess the performance of your model, particularly if you have unbalanced classes. Finally, you will ideally want a test set that comes from a different experiment; this will show whether your model generalizes. This means you'll need to think carefully about how to normalize the input data so that it is comparable across studies.
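A minimal base-R sketch of the logistic regression plus k-fold idea, not a definitive workflow. The data are simulated, and `gene_a`, `gene_b`, the 40/20 class split and the 0.5 cutoff are all assumptions made up for illustration. Note that if the DEGs were selected using all samples, the cross-validated performance will look better than it really is; ideally feature selection is repeated inside each training fold.

```r
set.seed(42)
n <- 60
# Simulated stand-in data: unbalanced classes (40 control, 20 disease).
y <- factor(rep(c("Control", "Disease"), times = c(40, 20)))
dat <- data.frame(
  gene_a = rnorm(n, mean = ifelse(y == "Disease", 1.5, 0)),
  gene_b = rnorm(n, mean = ifelse(y == "Disease", -1, 0)),
  y = y
)

k <- 5
# Stratified folds: assign fold labels separately within each class so that
# every fold keeps roughly the same class ratio.
fold <- integer(n)
for (cls in levels(y)) {
  idx <- which(y == cls)
  fold[idx] <- sample(rep(seq_len(k), length.out = length(idx)))
}

pred <- factor(rep(NA, n), levels = levels(y))
for (i in seq_len(k)) {
  test <- fold == i
  # Fit on the other k - 1 folds only.
  fit <- glm(y ~ gene_a + gene_b, data = dat[!test, ], family = binomial)
  # For a factor response, glm models the probability of the second level.
  p <- predict(fit, newdata = dat[test, ], type = "response")
  pred[test] <- ifelse(p > 0.5, "Disease", "Control")
}

# With unbalanced classes, report per-class rates rather than raw accuracy.
tab <- table(truth = dat$y, predicted = pred)
sensitivity <- tab["Disease", "Disease"] / sum(tab["Disease", ])
specificity <- tab["Control", "Control"] / sum(tab["Control", ])
balanced_accuracy <- (sensitivity + specificity) / 2
```

Balanced accuracy is used here because a model that always predicts the majority class would still reach about 67% raw accuracy on a 40/20 split.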

Personally, I found Andrew Ng's Coursera course on machine learning very useful for getting to grips with the basic concepts in machine learning. It focuses on the surrounding concepts as much as the algorithms/models themselves, which I found to be very helpful.


If only Andrew Ng used Python or R instead of Octave! I cannot wrap my head around that bizarre language.
