Question

What statistical analysis to perform on data from gene expression analysis?

6.0 years ago

DanielC ▴ 170

Dear All,

I performed hierarchical clustering (HCL) and self-organizing map (SOM) clustering on several cancer datasets to identify co-expression patterns among the genes involved in the cancer. My data look like this:

There are 25 microarray mRNA expression datasets from the GEO database (e.g. GDSXXX). In each dataset, I looked for co-expression within a set of gene pairs using HCL and got results like this:

Pair 1: gene x <--> gene y (found co-expressed in all 25 datasets)
Pair 2: gene r <--> gene z (found co-expressed in just 2 of the 25 datasets)
...
Pair 16: gene u <--> gene o (found co-expressed in all 25 datasets)

Similarly, I performed the co-expression analysis using SOM and got results like this:

Pair 1: gene x <--> gene y (found co-expressed in all 25 datasets)
Pair 2: gene r <--> gene z (found co-expressed in just 5 of the 25 datasets)
...
Pair 16: gene u <--> gene o (found co-expressed in 10 of the 25 datasets)

Could you please let me know what statistical analysis I can apply here to find the most significantly co-expressed gene pair(s) in both the HCL and SOM results? Thanks very much,

DK

statistical gene expression data • 1.3k views

Are you doing this all manually or as part of the WGCNA package?

6.0 years ago by Devon Ryan 104k

Thanks for the reply! I used R to perform the HCL and SOM clustering. I have not used the WGCNA package before. At this stage, I am looking for a statistical analysis that identifies the most significantly co-expressed gene pair(s) in both the HCL and SOM results, as in the example above. Could you let me know what analyses could be done for this purpose? Thanks.

6.0 years ago by DanielC ▴ 170

The WGCNA method to do that would be to look at the degree of module membership, or perhaps correlation between the genes and the module eigengene.

6.0 years ago by Devon Ryan 104k
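
To make the idea of module membership concrete, here is a minimal Python sketch (standard library only). It is not WGCNA's own implementation. It assumes a module has already been identified, approximates the module eigengene as the first principal component of the standardized expression profiles via power iteration, and then scores each gene by its Pearson correlation with that eigengene. The gene names and expression values are hypothetical toy data.

```python
import math

def standardize(values):
    """Scale one gene's profile to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    za, zb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / (len(a) - 1)

def module_eigengene(profiles, n_iter=100):
    """Approximate the first principal component (across samples) of a module
    by power iteration on the standardized genes-by-samples matrix."""
    z = [standardize(p) for p in profiles]
    n_samples = len(z[0])
    v = z[0][:]  # start from the first gene's standardized profile
    for _ in range(n_iter):
        # One power-iteration step: v <- X^T (X v), then renormalize
        scores = [sum(r * x for r, x in zip(row, v)) for row in z]
        v = [sum(s * row[j] for s, row in zip(scores, z)) for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v

# Hypothetical toy module: three genes measured in five samples
module = {
    "gene_x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_y": [1.1, 2.3, 2.9, 4.2, 4.8],
    "gene_r": [5.0, 3.0, 4.0, 1.0, 2.0],
}

eigengene = module_eigengene(list(module.values()))

# Module membership: correlation of each gene with the module eigengene
kme = {gene: pearson(profile, eigengene) for gene, profile in module.items()}
for gene, value in sorted(kme.items(), key=lambda kv: -abs(kv[1])):
    print(f"{gene}: {value:+.3f}")
```

Genes whose membership is close to +1 or -1 track the module's dominant expression pattern closely, so ranking by absolute membership is one way to pick out the most tightly co-expressed genes. The sketch assumes no gene has a constant profile, since standardizing such a profile would divide by zero.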

Thanks! Could you please let me know what you mean by "module" here, or point me to an appropriate source? Thanks!

6.0 years ago by DanielC ▴ 170

You've essentially created a network graph based on coexpression/covariation. One can subdivide that graph into covarying sections ("modules"), which can then be correlated with various things such as phenotypes and treatments.

6.0 years ago by Devon Ryan 104k
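
As a rough illustration of what "subdividing the graph into modules" can mean, the Python sketch below (standard library only) builds a co-expression graph by linking two genes whenever the absolute Pearson correlation of their profiles reaches a cutoff, then reports each connected component as a module. This is a deliberate simplification and an assumption of this example: WGCNA itself uses a different and more elaborate procedure. The gene names, expression values, and the 0.8 cutoff are all hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def coexpression_modules(expr, threshold=0.8):
    """Link genes with |r| >= threshold and return connected components."""
    genes = list(expr)
    neighbours = {g: set() for g in genes}
    for g1, g2 in combinations(genes, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first search collects every gene reachable from this one
        stack, component = [gene], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        modules.append(component)
    return modules

# Hypothetical toy data: four genes measured in six samples
expr = {
    "gene_x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "gene_y": [1.2, 1.9, 3.1, 4.2, 4.8, 6.1],
    "gene_u": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
    "gene_o": [3.1, 0.9, 4.2, 1.1, 4.8, 2.2],
}

modules = coexpression_modules(expr, threshold=0.8)
for i, members in enumerate(modules, start=1):
    print(f"module {i}: {sorted(members)}")
```

A hard cutoff makes the result sensitive to the chosen threshold, so in practice it is worth checking how stable the modules are across a range of cutoffs.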