Question: How to select the optimal genes for neural network analysis?
5.0 years ago by Avro (140), Canada
Avro wrote:

Hi everyone!

I am working with a mouse cancer model that produces tumors, and we have performed gene expression profiling on all of them. I would like to build a classifier that identifies human tumors which, based on their gene expression, are similar to my model (i.e. "mouse-like"). The microarrays have 27000+ features, and I suspect that I don't need that many. Hence, I was wondering whether there is a methodology for choosing how many features to keep and which ones. I know this seems counter-intuitive, because I shouldn't look at the data before I apply machine learning. I am currently reading papers.

Thank you for your input!

neural network gene • 1.4k views
modified 5.0 years ago by jgbradley1 (100) • written 5.0 years ago by Avro (140)

It IS safe to filter genes down to those with high variance across all samples, because that calculation does not use the class labels; this would be a quick and easy way to get a reasonable set for classification.

written 5.0 years ago by Sean Davis (25k)

Hi! Thank you for your quick response. Could I use a nonparametric ranking test (e.g. Wilcoxon) to get the genes with the highest variance?

written 5.0 years ago by Avro (140)

No. You may not use any measure of variability that takes the class labels into account.

written 5.0 years ago by Sean Davis (25k)

Thank you! Is there a way to choose a cutoff for the variance? I am asking because the variance values will be continuous. Bootstrap resampling?

written 5.0 years ago by Avro (140)

There is no "cutoff".  I suspect that you'll find that there is a pretty broad range that can result in similar performance.

written 5.0 years ago by Sean Davis (25k)
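
To make the variance filter discussed in this thread concrete, here is a minimal sketch in plain Python. The function name `top_variance_genes`, the gene names, and the toy expression values are all made up for illustration. The point is that the class labels are never passed to the filter, and that `k` is simply a knob to try at a few values rather than a principled cutoff.

```python
from statistics import variance

def top_variance_genes(expr, k):
    """Return the names of the k genes with the highest variance across samples.

    expr maps gene name -> list of expression values (one value per sample).
    Class labels are not an argument, so the filter cannot use them.
    """
    variances = {gene: variance(values) for gene, values in expr.items()}
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:k]

# Toy expression matrix (hypothetical numbers): 4 genes x 5 samples
expr = {
    "geneA": [5.0, 5.1, 4.9, 5.0, 5.0],
    "geneB": [2.0, 8.0, 3.0, 9.0, 2.5],
    "geneC": [7.0, 7.5, 6.5, 7.2, 6.8],
    "geneD": [1.0, 4.0, 1.5, 4.5, 1.2],
}

selected = top_variance_genes(expr, k=2)
print(selected)
```

In practice one would probably rerun the downstream classifier with several values of `k` (say a few hundred up to a few thousand genes) and check that the performance is stable across that range.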
5.0 years ago by jgbradley1 (100), United States
jgbradley1 wrote:

Have you considered reducing the number of features by doing some kind of PCA (Principal Component Analysis) approach?

written 5.0 years ago by jgbradley1 (100)

Hi! Thank you for your quick response. Yes. PCA reduces dimensionality while keeping most of the variability, right? I am reading more about it. By reducing the dimensionality, would I remove gene dimensions, or would I create new parameters that are no longer gene loci (e.g. z1, z2, ...)? Sorry if I am confused.

written 5.0 years ago by Avro (140)
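
On the PCA question, a small sketch may help. PCA does not drop genes; it builds new variables (z1, z2, ...) that are each a weighted combination of all the genes. The code below computes only the first component using power iteration in plain Python. The function name `first_principal_component` and the toy data are assumptions for illustration, and a real analysis would more likely use an SVD routine from a numerical library.

```python
import math

def first_principal_component(samples, n_iter=200):
    """Return (loadings, scores) for the first principal component.

    samples is a list of rows, one row per sample, one column per gene.
    loadings holds one weight per gene; scores holds one z1 value per sample.
    """
    n, p = len(samples), len(samples[0])
    # Center each gene (column) at zero mean
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]

    # Power iteration: repeatedly apply X^T X to a unit vector and renormalize
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        proj = [sum(x * wj for x, wj in zip(row, w)) for row in centered]
        new_w = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in new_w))
        w = [v / norm for v in new_w]

    # z1 for each sample is the weighted sum of its centered gene values
    scores = [sum(x * wj for x, wj in zip(row, w)) for row in centered]
    return w, scores

# Toy data (hypothetical numbers): 5 samples x 3 genes
samples = [
    [2.0, 1.9, 5.0],
    [8.0, 7.5, 5.2],
    [3.0, 3.2, 4.9],
    [9.0, 8.8, 5.1],
    [2.5, 2.4, 5.0],
]

loadings, scores = first_principal_component(samples)
print("weights per gene:", loadings)
print("z1 per sample:", scores)
```

Here the first two genes vary together, so both get a large weight in z1, while the nearly constant third gene gets a weight close to zero. The new feature z1 is therefore not a gene locus but a mixture of genes.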