How to select the optimal genes for neural network analysis?
9.5 years ago
Avro ▴ 160

Hi everyone!

I am working with a mouse cancer model that produces tumors, and we have run gene expression profiling on all of them. I would like to build a classifier that identifies human tumors whose gene expression resembles my model (i.e. "mouse-like"). The microarrays have 27,000+ features, and I suspect I don't need that many. Is there a methodology for choosing how many features to use, and which ones? I know this seems counter-intuitive, because I shouldn't look at the data before applying machine learning. I am currently reading papers on the subject.

Thank you for your input!

gene neural-network • 2.3k views

It IS safe to filter genes down to those with high variance, since that filter never looks at the class labels; this would be a quick and easy way to get a reasonable set for classification.
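A minimal sketch of such a filter, assuming the expression data is held as a samples x genes NumPy array. The function name `top_variance_genes` and the toy matrix are made up for illustration; they are not from the thread.

```python
import numpy as np

def top_variance_genes(expr, n_keep):
    """Return column indices of the n_keep most variable genes.

    expr: samples x genes matrix of (log-scale) expression values.
    Class labels are never consulted, so the filter is unsupervised.
    """
    variances = expr.var(axis=0, ddof=1)       # sample variance per gene
    order = np.argsort(variances)[::-1]        # most variable first
    return np.sort(order[:n_keep])

# Toy data (made up): 4 samples x 4 genes; genes 1 and 3 vary the most.
expr = np.array([
    [5.0, 1.0, 7.0, 2.0],
    [5.1, 4.0, 7.0, 9.0],
    [4.9, 2.0, 7.1, 5.0],
    [5.0, 6.0, 6.9, 1.0],
])
keep = top_variance_genes(expr, n_keep=2)
filtered = expr[:, keep]                       # reduced matrix for the classifier
print(keep, filtered.shape)
```

The same `keep` indices would then be applied to any new samples before classification.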


Hi! Thank you for your quick response. Could I use a nonparametric ranking test (e.g. Wilcoxon) to get the genes with the highest variance?


No. You may not use any measure of variability that includes the classes. A test such as Wilcoxon compares expression between the classes, so it is ruled out.
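A toy illustration of why label-based selection is a problem, under several assumptions: the data is pure noise, genes are ranked by the absolute difference of class means (a stand-in for a Wilcoxon-style ranking, chosen to keep the sketch NumPy-only), and a nearest-centroid rule stands in for the neural network. All names here are hypothetical.

```python
import numpy as np

def class_gap(X, y):
    """Absolute difference of class means per gene (uses the labels)."""
    return np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))

def nearest_centroid(X_train, y_train, x):
    """Predict the class whose training centroid is closer to x."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    return int(np.sum((x - c1) ** 2) < np.sum((x - c0) ** 2))

def loocv_accuracy(X, y, n_keep, select_inside_fold):
    """Leave-one-out accuracy with gene selection inside or outside the folds."""
    hits = 0
    leaked = np.argsort(class_gap(X, y))[-n_keep:]    # ranking sees every label
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        if select_inside_fold:
            genes = np.argsort(class_gap(X[train], y[train]))[-n_keep:]
        else:
            genes = leaked
        pred = nearest_centroid(X[train][:, genes], y[train], X[i, genes])
        hits += int(pred == y[i])
    return hits / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5000))      # pure noise: no gene carries real signal
y = np.repeat([0, 1], 20)
leaky = loocv_accuracy(X, y, 20, select_inside_fold=False)
honest = loocv_accuracy(X, y, 20, select_inside_fold=True)
print(f"selected on all samples: {leaky:.2f}; selected inside each fold: {honest:.2f}")
```

Because the data contains no signal, any accuracy well above chance in the first number comes purely from having used the labels of the held-out samples during selection.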


Thank you! Is there a way to choose a cutoff for the variance? I am asking because the variance values will be continuous. Bootstrap resampling?


There is no "cutoff". I suspect that you'll find that there is a pretty broad range that can result in similar performance.
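One way to check this on a given data set is to sweep the number of retained genes and compare cross-validated accuracy at each setting. The sketch below assumes synthetic data with a hypothetical signal planted in 30 genes, and uses a nearest-centroid rule in place of a neural network so that it stays NumPy-only; the candidate values of `k` are arbitrary.

```python
import numpy as np

def loocv_nearest_centroid(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    hits = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        c0 = X[train & (y == 0)].mean(axis=0)
        c1 = X[train & (y == 1)].mean(axis=0)
        pred = int(np.sum((X[i] - c1) ** 2) < np.sum((X[i] - c0) ** 2))
        hits += int(pred == y[i])
    return hits / len(y)

rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 2000))          # made-up: 40 samples x 2000 genes
X[y == 1, :30] += 2.0                    # hypothetical signal in the first 30 genes

variances = X.var(axis=0, ddof=1)        # ranking ignores the labels
ranked = np.argsort(variances)[::-1]     # most variable genes first

results = {k: loocv_nearest_centroid(X[:, ranked[:k]], y)
           for k in (50, 200, 1000, 2000)}
for k, acc in results.items():
    print(f"top {k} genes by variance: accuracy {acc:.2f}")
```

If the printed accuracies are close to each other across a wide range of `k`, the exact cutoff matters little and any value in that range is defensible.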

9.5 years ago
jgbradley1 ▴ 110

Have you considered reducing the number of features with PCA (Principal Component Analysis)?


Hi! Thank you for your quick response. Yes. PCA reduces dimensionality while keeping most of the variability, right? I am reading more about it. By reducing the dimensionality, would I remove gene dimensions, or would I create new parameters that are not gene loci (e.g. z1, z2, ...)? Sorry if I am confused.
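A small sketch may help with the distinction, assuming PCA is computed by SVD on a gene-centered samples x genes matrix. The random matrix and the choice of three components are made up for illustration. PCA does not drop genes; each new coordinate (z1, z2, ...) is a weighted combination of all genes.

```python
import numpy as np

rng = np.random.default_rng(0)
expr = rng.normal(size=(10, 200))            # made-up: 10 samples x 200 genes

centered = expr - expr.mean(axis=0)          # center each gene at zero
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

n_components = 3
loadings = Vt[:n_components]                 # (3, 200): one weight per gene
scores = centered @ loadings.T               # (10, 3): columns are z1, z2, z3
explained = (S ** 2 / np.sum(S ** 2))[:n_components]   # variance fractions

print(scores.shape, loadings.shape)
print("genes with non-zero weight in z1:", np.count_nonzero(loadings[0]))
```

The classifier would be trained on `scores`. New samples would need the same gene means subtracted and the same `loadings` applied before prediction.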
