I have RNA-seq data with the samples defined as follows:
ID   level   stability   type
A1   med     unstable    exp
A2   med     unstable    exp
A3   med     unstable    exp
B1   med     stable      exp
B2   med     stable      exp
B3   med     stable      exp
C1   low     stable      exp
C2   low     stable      exp
C3   low     stable      exp
D1   low     unstable    exp
D2   low     unstable    exp
D3   low     unstable    exp
E1   high    stable      exp
E2   high    stable      exp
E3   high    stable      exp
F1   high    unstable    exp
F2   high    unstable    exp
F3   high    unstable    exp
G1   host    host        host1
G2   host    host        host1
G3   host    host        host1
H1   host    host        host2
H2   host    host        host2
H3   host    host        host2
I1   host    host        host3
I2   host    host        host3
I3   host    host        host3
The ID column identifies the samples: each letter is one sample, and the number is one of its 3 biological replicates. My objective is to identify DEGs both in pairwise comparisons between samples (A vs B, B vs C, A vs D, etc.) and in group-wise comparisons (stable vs unstable, host vs expressed). However, when I define a contrast such as "contrast = c("stability","unstable","stable")", some of the reported DEGs are driven by only one replicate from each of 3 different stable or unstable samples being higher or lower than the rest.
As I understand it, this happens because the program reports a gene as a DEG when at least 3 samples within a group with the same profile differ significantly from the rest, even if those 3 are single replicates of 3 different samples. I would like to know whether there is a parameter that requires more than 3 samples in a group of 9 (perhaps at least 6 in our case) to be concordant before a gene is reported as a DEG.
If not, could someone kindly suggest another way to avoid getting DEGs that represent only 3 samples of a large group rather than the entire group?
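
To make the "at least 6 of 9" criterion concrete, here is a minimal sketch of the kind of post-hoc filter I have in mind. It is written in plain Python purely for illustration and is not part of any DESeq2 workflow. The names `concordant_count`, `passes_concordance`, `min_fold` and `min_samples` are hypothetical, the 2-fold cutoff is an arbitrary assumption, and the numbers are made-up toy values for a single gene, not taken from my data.

```python
from statistics import mean, median


def concordant_count(test_values, reference_values, min_fold=2.0):
    """Count test samples that differ from the reference median by at least
    `min_fold`, in the direction of the overall change of the test group.

    Values are assumed to be normalized counts for one gene; a pseudocount
    of 1 avoids division by zero. This is an illustrative heuristic only.
    """
    ref_mid = median(reference_values)
    # Overall direction is taken from the group mean, which is what a few
    # extreme replicates can pull up or down.
    going_up = mean(test_values) >= ref_mid
    count = 0
    for value in test_values:
        ratio = (value + 1.0) / (ref_mid + 1.0)
        if going_up and ratio >= min_fold:
            count += 1
        elif not going_up and ratio <= 1.0 / min_fold:
            count += 1
    return count


def passes_concordance(test_values, reference_values, min_samples=6, min_fold=2.0):
    """Return True if at least `min_samples` test samples are concordant."""
    return concordant_count(test_values, reference_values, min_fold) >= min_samples


# Toy normalized counts for one gene (made-up numbers).
stable = [100, 110, 95, 105, 98, 102, 97, 103, 101]
# Only 3 of the 9 unstable samples are high: the case I want to exclude.
outlier_driven = [900, 100, 105, 950, 98, 102, 1000, 99, 104]
# All 9 unstable samples are consistently high: the case I want to keep.
consistent = [300, 320, 290, 310, 305, 280, 330, 295, 315]

print(concordant_count(outlier_driven, stable), passes_concordance(outlier_driven, stable))
print(concordant_count(consistent, stable), passes_concordance(consistent, stable))
```

With these toy values the first gene has only 3 concordant samples and would be dropped, while the second has all 9 and would be kept. Something along these lines could be applied to the normalized counts of the genes reported for the stable vs unstable contrast, but I do not know whether this is statistically sound, hence the question.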