Dealing With High-Dimensional Gene Expression Data And A Low Number Of Examples
10.9 years ago
mikewhity • 0

I have the general problem of high-dimensional gene expression data with a low number of examples. Specifically, I have drug responses for some cancer cells, and gene expression for those cells measured before the drugs were applied. I want to relate the response to the gene expression, that is, explain the drug response from the gene expression. I only have around 18 examples, and the gene expression data has about 25000 dimensions.

I tried correlation analysis: for each drug, I looked at which genes are highly correlated with the drug response, selected the highly correlated genes, and used a hypergeometric test to see whether any pathways are overrepresented among the selected genes/features.

However, I haven't got anything significant when running the pathway analysis. Any suggestions on how I should proceed?
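
For concreteness, the two steps described above can be sketched roughly as follows. This is a minimal standard-library Python illustration, not the exact code used here: the function names `pearson` and `enrichment_p` are made up for the example, and a real analysis would take the gene universe and pathway memberships from an annotation source.

```python
from math import comb, sqrt

def pearson(x, y):
    """Pearson correlation between one gene's expression and the drug response."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector has no defined correlation; treat as 0
    return sxy / sqrt(sxx * syy)

def enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(overlap >= n_overlap).

    n_universe: all genes measured (e.g. about 25000)
    n_pathway:  genes in the universe annotated to the pathway
    n_selected: genes picked by the correlation filter
    n_overlap:  selected genes that belong to the pathway
    """
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, i) * comb(n_universe - n_pathway, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total
```

For example, `enrichment_p(25000, 100, 200, 5)` would be the p-value for seeing at least 5 genes from a 100-gene pathway among 200 selected genes (numbers chosen only for illustration).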

pathway gene-expression • 2.5k views

In practice, enrichment analysis is sensitive to the number of input genes (especially KEGG pathway enrichment), so if only a small number of genes are disturbed by the treatment, finding nothing significant would be unsurprising. If so, you can pick out the disrupted genes by their intensity fold changes between treatments (i.e. case vs. control) and take a closer look at the most disrupted genes.

10.9 years ago

I'd suggest performing a hypothesis test. In particular, you may want to perform a regression of drug response vs. gene expression; the limma Bioconductor package can do this. Also, depending on your drug response assay, you may be interested in stratifying the samples into "responders" and "non-responders" and using limma to test for differences in gene expression between the two groups.

Contrary to popular belief (that you may or may not hold), finding nothing significant or understandable after running pathway analysis is not uncommon, so I would not use that as the single measure of success or failure of your methods.
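
To illustrate the responder vs. non-responder comparison for a single gene, here is a plain-Python sketch of a permutation test on the difference in group means. It is not what limma does (limma fits linear models and uses moderated t-statistics); the function name, the default of 10000 permutations, and the fixed seed are assumptions made for the example.

```python
import random

def permutation_p_value(responders, non_responders, n_perm=10000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means.

    responders / non_responders: expression values of ONE gene in each group.
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(responders) - mean(non_responders))
    pooled = list(responders) + list(non_responders)
    n_resp = len(responders)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = abs(mean(pooled[:n_resp]) - mean(pooled[n_resp:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

With 18 samples split 9 vs. 9 there are only 48620 distinct splits, so the smallest two-sided p-value an exact permutation test on one gene can reach is about 4e-5. That is worth keeping in mind when the results are later corrected across roughly 25000 genes.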


I have tried the drug response evaluation. Based on whether the drug kills the cells or not, I separated them into two classes: the cells which are killed by the drug and the cells which aren't. Then, using hypothesis tests such as the t-test and the Kruskal-Wallis test, I selected those genes which could separate the two classes, but didn't get anything significant there either. Again, I selected the genes that distinguish the two classes and did a hypergeometric test to see if there were any significant pathways. I am not sure what limma will do and how it will help. I tried to install it before, but due to library issues I couldn't install it, so I couldn't try it. Can you let me know what limma can do?

I am not sure what regression will help me with. I don't want a model which predicts the drug response based on gene expression. I tried to create a linear regression model with lasso regularization, but that doesn't give me anything significant. I don't want a predictor. Any suggestions?


Limma has an extensive user guide. To install it, first install R and then follow the instructions on the Bioconductor website. If you have problems installing, feel free to write to the Bioconductor email list with details.

As for significant genes, there is no guarantee that there will be any.
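
Part of the reason is multiple testing: with roughly 25000 genes tested at once, per-gene p-values are normally adjusted before anything is called significant, and limma's `topTable` reports Benjamini-Hochberg adjusted p-values by default. A minimal Python sketch of that adjustment follows; the function name is illustrative and this is not limma's own code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Genes whose adjusted p-value falls below the chosen false discovery rate (0.05 is a common choice) would be reported; if none do, that is a legitimate outcome rather than a sign the method failed.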

Finally, consider finding a local collaborator who has worked with gene expression data before; you can certainly spend a lot of time trying to reinvent wheels and troubleshoot issues that are not really problems.


@Sean Davis, I currently don't have any local collaborators. Since I have very few samples, around 18 cells, I wanted to know whether that will be enough to get significant results. Are there any references where something significant has been found using such small sample sizes? Can you give me some resources that would help?


You are saying that you have 18 samples (18 arrays)? As for "is that enough", I cannot answer that since I do not know how large an effect the drug has on gene expression, but I would not consider 18 samples to be extremely small. In particular, 18 samples does not mandate taking any special approaches to analysis besides using microarray-specific tools.
