This is quite a broad question, but I hope to get some interesting answers. I am going to have a large number of RNA-seq and microarray samples (approximately 150-300) for a disease phenotype. We expect gene expression to be very similar across these patients, but there should be subpopulations that respond well or poorly to treatment.

I am concerned that, although there will certainly be clinically relevant expression differences between subpopulations, detecting them is going to be hard work. For example, differential expression tools like limma/DESeq may report few DE genes between subpopulations because of substantial noise. To counter this, I intend to filter the genes on variance. As I understand it, once I have filtered on variance I can no longer use limma/DESeq etc. and will need to use an ordinary t-test or something similar. This is one tactic to increase statistical power.

Does anyone have any good tips for analysing this type of data, where we do not expect large expression differences between subgroups?
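
To make the filter-then-test tactic concrete, here is a minimal sketch in Python with NumPy and SciPy. It assumes a log-scale expression matrix with genes in rows and samples in columns, plus a boolean responder label per sample. The function names (`variance_filter`, `benjamini_hochberg`, `filtered_ttest`), the `keep_fraction` cutoff, and the Benjamini-Hochberg correction are illustrative assumptions, not part of any established pipeline, and the data are simulated.

```python
import numpy as np
from scipy import stats


def variance_filter(expr, keep_fraction=0.5):
    """Return sorted row indices of the most variable genes.

    The variance is computed across all samples, without using group labels.
    keep_fraction is a hypothetical cutoff chosen for illustration.
    """
    variances = expr.var(axis=1, ddof=1)
    n_keep = max(1, int(round(keep_fraction * expr.shape[0])))
    # argsort is ascending, so the last n_keep indices are the most variable
    return np.sort(np.argsort(variances)[-n_keep:])


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity, working down from the largest p-value
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out


def filtered_ttest(expr, is_responder, keep_fraction=0.5):
    """Variance-filter the genes, then run an ordinary two-sample t-test per gene."""
    mask = np.asarray(is_responder, dtype=bool)
    kept = variance_filter(expr, keep_fraction)
    responders = expr[kept][:, mask]
    non_responders = expr[kept][:, ~mask]
    t_stat, p_val = stats.ttest_ind(responders, non_responders, axis=1)
    return kept, t_stat, p_val, benjamini_hochberg(p_val)


# Simulated example: 200 genes, 40 samples, a small shift in the first 10 genes.
rng = np.random.default_rng(0)
expr = rng.normal(loc=8.0, scale=1.0, size=(200, 40))
labels = np.arange(40) < 20
expr[:10, labels] += 0.8

kept, t_stat, p_val, p_adj = filtered_ttest(expr, labels, keep_fraction=0.5)
print("genes tested:", kept.size)
print("genes with adjusted p < 0.05:", int((p_adj < 0.05).sum()))
```

Because the filter looks only at the overall variance and never at the group labels, it does not select genes based on the comparison being tested. Only the genes that pass the filter enter the multiple-testing correction, which is where any gain in power would come from.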