Microarray Class Prediction - For Continuous Data?
9.9 years ago

I was wondering if anyone could help point me in the right direction for the following problem (changed slightly to improve comprehensibility).

Let's say that I have a set of 500 microarrays taken from blood samples from 500 different people. Each person is a different age. I want to build a model that can predict a person's age based on as few genes as possible. If there were two classes of people ("young" and "old"), I could use a straightforward binary classification algorithm. But I want to predict a person's exact age, so I'm not sure what method to use when the outcome is basically continuous (500 different ages) rather than just 2 classes. Thanks!

microarray classification prediction • 2.8k views
9.9 years ago
Johan ▴ 880

I'm not specifically familiar with microarray data, but I think that what you are looking to create is a regression model (http://en.wikipedia.org/wiki/Regression_analysis), the most basic example of this being linear regression.

To reduce the number of genes used by the classifier you might want to look into feature selection (http://en.wikipedia.org/wiki/Feature_selection) - this should help you select a subset of genes.
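As a rough illustration of the two ideas above (feature selection followed by linear regression), here is a minimal sketch in plain Python. It ranks genes by their absolute Pearson correlation with age, keeps the top few, and fits a one-gene least-squares line. The data are synthetic stand-ins and the helper names (`pearson`, `top_genes`, `fit_line`) are made up for this example; this is only one simple filter method, not necessarily the best choice for real microarray data.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def top_genes(expr, ages, k):
    """Indices of the k genes most correlated (in absolute value) with age.

    expr[i][j] is the expression of gene j in sample i.
    """
    n_genes = len(expr[0])
    scores = [abs(pearson([row[j] for row in expr], ages)) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:k]

def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Synthetic stand-in data (an assumption, not real microarray values).
rng = random.Random(0)
n_samples, n_genes = 200, 30
ages = [rng.uniform(20, 80) for _ in range(n_samples)]
expr = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
for i, age in enumerate(ages):
    expr[i][3] += 0.1 * age    # gene 3 rises with age
    expr[i][7] -= 0.08 * age   # gene 7 falls with age

# Keep the two genes most correlated with age, then regress age on the best one.
selected = top_genes(expr, ages, k=2)
best = [row[selected[0]] for row in expr]
slope, intercept = fit_line(best, ages)
predicted = [slope * x + intercept for x in best]
```

Note that choosing genes and judging the fit on the same samples gives an optimistic picture; on real data the selection step should be repeated inside each cross-validation fold.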

There are a number of programs which implement general machine learning algorithms: WEKA, which a lot of people seem to use, and RapidMiner, which I personally prefer. If you want a good starting place for learning RapidMiner, this blog and accompanying YouTube channel should give you a good start: http://vancouverdata.blogspot.se/

As I said, I have never worked with microarray data, but the methods that I mention should be transferable to any machine learning problem. Hope that this helps. :)

9.9 years ago

Johan is correct; what you have is a regression problem, not a classification problem. First, I suggest you read up on the elements of linear models. Dalgaard's book is very accessible. Then, consider looking at the Lasso, a penalized form of linear regression that shrinks many coefficients to exactly zero (i.e. it attempts to find a small set of features which still provide a good fit). There is a large literature here, but this is one place to start. Several libraries in R implement variants of the lasso (google "r lasso" or head over to CRAN).
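To make the Lasso idea concrete, here is a minimal sketch in plain Python that minimizes (1/(2n))·||y − Xb||² + lam·||b||₁ by cyclic coordinate descent on standardized predictors. The data are synthetic, the penalty `lam = 3.0` is an arbitrary choice for this toy example, and the function names are made up here; for real work a maintained library would be the better choice than this loop.

```python
import random

def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso(X, y, lam, n_iter=100):
    """Lasso coefficients (on standardized columns) via coordinate descent.

    Assumes no column of X is constant, since each column is divided by
    its standard deviation.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        m = sum(col) / n
        sd = (sum((v - m) ** 2 for v in col) / n) ** 0.5
        cols.append([(v - m) / sd for v in col])
    beta = [0.0] * p
    resid = [yi - y_mean for yi in y]  # residual of the current fit
    for _ in range(n_iter):
        for j in range(p):
            col = cols[j]
            # Covariance of gene j with the residual that excludes gene j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta

# Synthetic stand-in data (an assumption, not real microarray values).
rng = random.Random(0)
n_samples, n_genes = 200, 30
ages = [rng.uniform(20, 80) for _ in range(n_samples)]
expr = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
for i, age in enumerate(ages):
    expr[i][3] += 0.1 * age    # gene 3 rises with age
    expr[i][7] -= 0.08 * age   # gene 7 falls with age

beta = lasso(expr, ages, lam=3.0)
selected = [j for j, b in enumerate(beta) if b != 0.0]  # genes kept by the penalty
```

A larger `lam` keeps fewer genes and a smaller one keeps more, so on real data the penalty is usually chosen by cross-validation rather than fixed by hand.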

7.8 years ago

My study is related to yours. I used Least Angle Regression and the Lasso, but I am still looking for a microarray data set to use. Could you help me find one? Thank you!
