I am working on differential gene expression in an Affymetrix microarray dataset (2 different cancers). I’ve done the analysis using the limma package, with multiple-testing correction across all probes (over 50k). No problem there.
My question: I want to use around 100 genes (from a certain pathway) to cluster the cancers. Can I just pull the expression values for those genes from the Affymetrix set and run the DE analysis on that subset only, OR should I first test all the probes and then check whether the adjusted p-values (adj.P.Val) of those 100 genes are significant? What if only some of them have an adjusted p-value under 0.05?
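To make the two options concrete, here is a minimal sketch in Python (not my real data and not my limma code). The p-values are simulated, the choice of which probes belong to the pathway is made up, and `bh_adjust` is a hand-written Benjamini-Hochberg helper, which I am assuming is the kind of correction that applies here. It only compares the multiple-testing step and does not reproduce limma's moderated statistics.

```python
import random


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


random.seed(0)
n_probes = 50000   # assumption: roughly the size of the full probe set
n_pathway = 100    # assumption: one probe per pathway gene

# Simulated raw p-values (hypothetical): mostly null, uniform on [0, 1).
pvals = [random.random() for _ in range(n_probes)]

# Pretend the first 100 probes are the pathway probes and that
# 20 of them carry a real signal (small raw p-values).
pathway_idx = list(range(n_pathway))
for i in pathway_idx[:20]:
    pvals[i] = random.uniform(1e-6, 1e-3)

# Option A: adjust across all probes, then look up the pathway probes.
adj_all = bh_adjust(pvals)
adj_all_pathway = [adj_all[i] for i in pathway_idx]

# Option B: keep only the pathway probes, then adjust across those 100.
adj_subset = bh_adjust([pvals[i] for i in pathway_idx])

sig_all = sum(p < 0.05 for p in adj_all_pathway)
sig_subset = sum(p < 0.05 for p in adj_subset)
print("pathway probes with adjusted p < 0.05, adjusted over all probes:", sig_all)
print("pathway probes with adjusted p < 0.05, adjusted over the subset:", sig_subset)
```

Option B corrects for 100 tests instead of roughly 50k, so the same raw p-value can end up with a much smaller adjusted value there. That difference is really what my question comes down to.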
Thanks in advance