I have a microarray expriment (Affymetrix) with 50 samples (each belonging to one of 5 disease subtypes) and about 5000 genes after filtering. Now, I would like to find a list of, let's say, 20 genes whose expression can be used to correctly classify the disease subtypes.
For now, I have created a design matrix in R that defines which subtype a sample belongs to. I then create a contrast matrix to distinguish between class A and the other classes, use eBayes to get the p-values, adjust them for multiple testing:
contrast.matrix = makeContrasts(A-(B+C+D+E)/4, levels=design) fit = eBayes(contrasts.fit(lmFit(myFilteredGenes, design), contrast.matrix))
This is repeated for each class, so I end up with a 5000x5 matrix of p-values (where one column corresponds to how significant the current class expression compared to the average of all others is). How can I use this matrix to extract my class-discriminant genes? Or [preferred] how can I apply statistical testing on all classes instead of just 2 at a time?
Note: this is (1) for all classes at once (I'm not searching for pairwise comparison) and (2) taking the 4 highest p-values for each class might not be the smartest approach (or is it?).