Dear All,
I performed hierarchical (HCL) and self-organizing map (SOM) clustering on several cancer datasets to identify co-expression patterns among the genes involved in the cancer. For instance, I have data like this:
There are 25 microarray mRNA expression datasets from the GEO database (like GDSXXX). In each dataset, I looked for co-expression of a set of gene pairs using HCL and got results like this:
Pair 1: gene x <--> gene y (found co-expressed) in all datasets from dataset 1 to dataset 25
Pair 2: gene r <--> gene z (found co-expressed) in just 2 datasets among all datasets from dataset 1 to dataset 25
.
.
.
.
Pair 16: gene u <--> gene o (found co-expressed) in all datasets from dataset 1 to dataset 25
Similarly, I performed the co-expression analysis using SOM and got results like this:
Pair 1: gene x <--> gene y (found co-expressed) in all datasets from dataset 1 to dataset 25
Pair 2: gene r <--> gene z (found co-expressed) in just 5 datasets among all datasets from dataset 1 to dataset 25
.
.
.
.
Pair 16: gene u <--> gene o (found co-expressed) in 10 datasets from dataset 1 to dataset 25
Can you please let me know what statistical analysis I can apply here to find the most significantly co-expressed gene pair(s) in both HCL and SOM? Thanks much,
DK
Are you doing this all manually or as part of the WGCNA package?
Thanks for the reply! I used R to perform the HCL and SOM clustering. I have not used the WGCNA package before. At this stage, I am looking for a statistical analysis that would identify the most significantly co-expressed gene pair(s) in both the HCL and SOM results, as in the example above. Could you let me know what analyses could be done for this aim? Thanks.
The WGCNA method to do that would be to look at the degree of module membership, or perhaps correlation between the genes and the module eigengene.
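To make those two quantities concrete, here is a minimal sketch in Python with NumPy. It is not the WGCNA package itself (that is an R package); it only illustrates the idea under some assumptions: the module eigengene is taken as the first principal component of the standardized expression of the module's genes, and module membership as the Pearson correlation of each gene with that eigengene. The function names and the synthetic data are made up for illustration.

```python
import numpy as np


def module_eigengene(expr):
    """First principal component of a module (expr: samples x genes).

    Assumes every gene has non-zero variance across samples.
    """
    # Standardize each gene so that no single gene dominates the component.
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a principal component is arbitrary; orient it so that it
    # correlates positively with the average standardized expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene


def module_membership(expr, eigengene):
    """Pearson correlation of each gene (column) with the eigengene."""
    return np.array(
        [np.corrcoef(expr[:, j], eigengene)[0, 1] for j in range(expr.shape[1])]
    )


# Synthetic example: 50 samples, 4 genes following a shared signal plus
# 1 gene that is pure noise (all values are made up).
rng = np.random.default_rng(0)
signal = rng.normal(size=50)
coexpressed = signal[:, None] + 0.3 * rng.normal(size=(50, 4))
noise_gene = rng.normal(size=(50, 1))
expr = np.hstack([coexpressed, noise_gene])

eigengene = module_eigengene(expr)
kme = module_membership(expr, eigengene)
print(np.round(kme, 2))
```

Genes with a membership close to 1 track the module closely, so ranking gene pairs by the membership of both genes is one way to pick out the most tightly co-expressed ones.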
Thanks! Could you please let me know what you mean by "module" here? Or could you point me to an appropriate source? Thanks!
You've essentially created a network graph based on coexpression/covariation. One can subdivide that graph into covarying sections ("modules") that can then be correlated with various things such as phenotypes and treatments.
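As a rough illustration of what "subdividing the graph" can mean, here is a small Python sketch. Note that it swaps in a much simpler technique than WGCNA uses: it hard-thresholds the absolute Pearson correlation to get edges and treats each connected component as a module. The threshold of 0.8, the function name, and the synthetic data are assumptions for the example only.

```python
import numpy as np


def coexpression_modules(expr, threshold=0.8):
    """Label genes (columns of expr, samples x genes) by connected component
    of the graph whose edges join genes with |Pearson r| >= threshold."""
    corr = np.corrcoef(expr, rowvar=False)
    adjacency = np.abs(corr) >= threshold
    n_genes = adjacency.shape[0]
    labels = [-1] * n_genes
    current = 0
    for start in range(n_genes):
        if labels[start] != -1:
            continue
        # Depth-first search to collect every gene reachable from `start`.
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in np.flatnonzero(adjacency[node]):
                if labels[neighbour] == -1:
                    labels[neighbour] = current
                    stack.append(int(neighbour))
        current += 1
    return labels


# Synthetic example: genes 0-2 follow one pattern and genes 3-4 another.
rng = np.random.default_rng(1)
t = np.linspace(0, 4 * np.pi, 60)
pattern_a, pattern_b = np.sin(t), np.cos(t)
expr = np.column_stack(
    [pattern_a + 0.2 * rng.normal(size=60) for _ in range(3)]
    + [pattern_b + 0.2 * rng.normal(size=60) for _ in range(2)]
)

labels = coexpression_modules(expr)
print(labels)
```

Each label is one module. Once genes are grouped this way, a per-module summary (such as the eigengene) can be correlated with a phenotype or treatment variable.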