Next step after regression analysis? Is it clustering?
8.0 years ago

Hi,

I am an old-fashioned biologist with a gene of interest. After learning how to download and play with TCGA data, I found that expression of my gene of interest correlates with expression of several genes that belong to a specific pathway. This fits my hypothesis that upregulation of my gene of interest would cause (or at least be associated with) upregulation of those other genes. But is there anything further I can do now computationally?

I would like to know whether I can identify a subgroup of patients in the TCGA dataset that have high expression of both my gene and its individual correlates (or a subgroup that has low expression of my gene of interest and its correlates). Can anybody suggest a way to do that? Or a paper I can read?

Thanks in advance. Confused (but evolving) biologist
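One way to make the subgroup question concrete is to standardize each gene across patients, average the correlates into a single score, and then flag patients where both the gene of interest and the score are high (or both low). The sketch below is a minimal illustration using only the Python standard library; the gene names, the toy values, and the 0.5 z-score cutoff are all assumptions chosen for illustration, not values from TCGA.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize one gene's expression across patients (mean 0, SD 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pathway_score(expr, genes):
    """Mean z-score across the listed genes, one value per patient."""
    z = [zscores(expr[g]) for g in genes]
    return [mean(per_patient) for per_patient in zip(*z)]


def split_subgroups(expr, gene_of_interest, correlates, cutoff=0.5):
    """Group patient indices by joint high or joint low expression.

    `cutoff` is an arbitrary z-score threshold (an assumption); tune it
    or replace it with quantiles for real data.
    """
    goi = zscores(expr[gene_of_interest])
    score = pathway_score(expr, correlates)
    groups = {"high_high": [], "low_low": [], "other": []}
    for i, (g, s) in enumerate(zip(goi, score)):
        if g >= cutoff and s >= cutoff:
            groups["high_high"].append(i)
        elif g <= -cutoff and s <= -cutoff:
            groups["low_low"].append(i)
        else:
            groups["other"].append(i)
    return groups


# Toy data: hypothetical gene names, six patients, made-up values.
expr = {
    "GOI": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "MAPK_A": [1.1, 1.9, 3.2, 3.8, 5.1, 6.3],
    "MAPK_B": [0.5, 1.0, 1.4, 2.1, 2.4, 3.0],
}
groups = split_subgroups(expr, "GOI", ["MAPK_A", "MAPK_B"])
print(groups)
```

Averaging z-scores weights every correlate equally, which is the simplest choice rather than necessarily the best one; hierarchical clustering or factor analysis on the same standardized matrix would be alternative ways to look for such subgroups.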

tcga gene expression stat regression R • 1.9k views

Is Factor Analysis a good option for this?


Have you done this initial analysis for a single sample (or more than one)?

There are several portals that allow access to TCGA data over the web. Examples are http://www.cbioportal.org/, https://dcc.icgc.org/, and http://www.oncolnc.org/. You could start by using these portals to identify the samples that interest you.


Hi,

Thank you for your reply. I have done this initial analysis in the TCGA BRCA provisional dataset (the one with 1105 patient samples). I used linear regression and found that expression of my gene of interest was predictive of expression of several different genes that belong to MAPK pathways.

I used cBioPortal to download the dataset, and I see that I can pick high or low expressors of my gene of interest. But what is the appropriate statistical test after that?

If I separate out, say, the highest and lowest quartiles of my gene of interest, should I then do t-tests to show that they have differential expression of MAPK genes? I know how to do that for one gene at a time, but is there a way to do it for all the genes at once? And is there a way to come up with some kind of score for MAPK pathway upregulation, so that I can test whether that score explains how my gene of interest affects survival?

Thanks for your time.
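The quartile comparison described above can be sketched as follows: split patients into the bottom and top quartile of the gene of interest, compute a Welch t statistic for every other gene, and adjust the resulting p-values for multiple testing with the Benjamini-Hochberg procedure. This is only an illustrative sketch using the Python standard library. The gene names and values are made up, and the p-value uses a normal approximation to the t distribution, which is an assumption that is only reasonable when each quartile holds many patients (as in a dataset of about 1100 samples), not for the tiny toy groups shown here.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def quartile_groups(values):
    """Return patient indices in the bottom and top quartile of `values`."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    k = len(values) // 4
    return order[:k], order[-k:]


def welch_t(a, b):
    """Welch's t statistic for two samples with unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def approx_p(t):
    """Two-sided p-value from a normal approximation (assumes large groups)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data: hypothetical gene names, twelve patients, made-up values.
expr = {
    "GOI": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "MAPK_A": [2.0, 2.5, 1.8, 3.0, 3.5, 4.1, 4.0, 5.2, 5.0, 6.1, 6.6, 7.0],
    "UNRELATED": [5.0, 3.0, 4.0, 4.2, 3.9, 5.1, 2.8, 4.4, 3.6, 4.1, 4.9, 3.1],
}
low, high = quartile_groups(expr["GOI"])
genes = ["MAPK_A", "UNRELATED"]
pvals = []
for gene in genes:
    t = welch_t([expr[gene][i] for i in high], [expr[gene][i] for i in low])
    pvals.append(approx_p(t))
adjusted = dict(zip(genes, benjamini_hochberg(pvals)))
print(adjusted)
```

For a real analysis, an exact Welch test such as `scipy.stats.ttest_ind` with `equal_var=False` would be a better choice than the normal approximation. Note also that genes picked because they correlate with the gene of interest will tend to differ between its extreme quartiles by construction, so this comparison is more informative for genes that were not used in the original selection.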
