I have a gene expression microarray dataset (about 17,000 genes) of about 400 cancer samples from different cancer subtypes (say A, B, C, and D) and about 30 control samples, pooled from several datasets (a meta-analysis). I used only the cancer samples, took the 50% of genes with the highest variance as input for WGCNA, and selected the signed network type. I treated the subtypes as binary traits and used WGCNA to find modules associated with each trait, along with the corresponding hub genes. Some modules turned out to be significantly associated with subtypes A and B only. In the next step, I applied module preservation analysis to examine whether the modules associated with subtype A are preserved in the other subtypes. I used subtype A as the reference and each of the other subtypes as a test set, running the preservation analysis separately for each reference-test pair. As I more or less expected, the modules associated with subtype A are not preserved in the other subtypes. I have some questions about this and would appreciate your suggestions.
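To make the gene-filtering step concrete, here is a minimal sketch of keeping the top 50% of genes by variance. It is written in plain Python rather than with WGCNA itself, and the function name `top_variance_genes` and the toy expression values are hypothetical, used only for illustration.

```python
from statistics import variance

def top_variance_genes(expr, fraction=0.5):
    """Return the gene names in the top `fraction` by sample variance, highest first."""
    # expr maps gene name -> list of expression values across samples
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]

# Hypothetical toy data: 4 genes measured in 3 samples
toy_expr = {
    "geneA": [1.0, 5.0, 9.0],
    "geneB": [2.0, 2.1, 1.9],
    "geneC": [0.0, 4.0, 2.0],
    "geneD": [3.0, 3.0, 3.0],
}
selected = top_variance_genes(toy_expr)
print(selected)
```

In the real analysis the same filter would be applied to the full expression matrix of the cancer samples before network construction.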
- Are the steps above logical in your view? Is it reasonable to run module preservation analysis within the same dataset?
I’m also thinking of running module preservation analysis with the control samples as the reference and each subtype as a test set. However, I’m still not sure about it, since the control group is rather small (28 samples) and some modules will obviously be non-preserved between the controls and each cancer subtype. I would appreciate your comments and suggestions.
- Regarding the module preservation analysis, as I read, the `Zsummary` parameter has a strong dependence on module size, so I used the `medianRank` parameter and considered modules with `medianRank ≥ 8` as non-preserved modules. Is that acceptable?
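To make the rule explicit, this is a minimal sketch of the cutoff as I applied it. The function name `flag_non_preserved`, the module names, and the rank values are hypothetical; they are not output from my analysis. Since `medianRank` is a relative rank among the modules in one reference-test comparison, what a cutoff of 8 means presumably depends on how many modules there are, which is part of why I am asking.

```python
def flag_non_preserved(median_rank, cutoff=8):
    """Return the modules whose medianRank is at or above the cutoff, sorted by name."""
    # median_rank maps module name -> medianRank from one reference/test comparison
    return sorted(module for module, rank in median_rank.items() if rank >= cutoff)

# Hypothetical medianRank values for four modules
toy_ranks = {"blue": 2, "brown": 9, "turquoise": 5, "yellow": 8}
non_preserved = flag_non_preserved(toy_ranks)
print(non_preserved)
```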
Thank you in advance