I would like to run a network analysis with WGCNA on my microarray dataset, which includes 700 samples and 22,000 genes. I have two questions:

1) The dataset has a batch effect: hierarchical clustering with flashClust() produced 2 discrete groups among the patients and 2 discrete groups among the controls. I have already read the viewpoints of Kevin Blighe and ivivek_ngs (Batch effects : ComBat or removebatcheffects (limma package) ?) on removing batch effects. Is it reasonable to analyze one group as a training set and the other as a validation set?

2) I processed the data with neqc() and then filtered them. Which method is better for selecting a set of genes for analysis: the coefficient of variation (CV) or the MAD (median absolute deviation)?
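To make the comparison in question 2 concrete, here is a minimal sketch of the two per-gene scores. It is written in plain Python rather than R so that it is self-contained, and it is not taken from WGCNA or limma. The function names and the toy expression values are invented for illustration. The MAD here is the raw, unscaled version; R's mad() multiplies by 1.4826 by default, which changes the values but not the ranking of genes.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    mean = statistics.mean(values)
    if mean == 0:
        return float("nan")  # CV is undefined when the mean is zero
    return statistics.stdev(values) / abs(mean)


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def top_genes(expr, score, n):
    """Return the n gene names with the highest score, best first.

    expr  -- dict mapping gene name to a list of expression values
    score -- function taking a list of values and returning a number
    """
    ranked = sorted(expr, key=lambda gene: score(expr[gene]), reverse=True)
    return ranked[:n]


# Toy data (hypothetical values): one flat gene, one broadly variable
# gene, and one gene that is flat except for a single outlying sample.
toy = {
    "gene_a": [8.0, 8.1, 7.9, 8.0, 8.2],
    "gene_b": [6.0, 9.0, 7.5, 10.0, 5.0],
    "gene_c": [7.0, 7.0, 7.0, 7.0, 13.0],
}

if __name__ == "__main__":
    # CV ranks the single-outlier gene first; MAD ranks it last.
    print("by CV: ", top_genes(toy, coefficient_of_variation, 3))
    print("by MAD:", top_genes(toy, median_absolute_deviation, 3))
```

On this toy input the two scores disagree about gene_c: one extreme sample is enough to give it the highest CV, while its MAD is zero because most samples sit exactly on the median. Whether that robustness is desirable depends on whether a signal carried by a few samples counts as biology or as noise in the dataset at hand.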

