I have RNAseq data from 4 developmental conditions (let's say: 1, 2, 3, 4), 3 replicates each. I would like to find genes that are a) differentially expressed in stepwise comparisons but also b) genes that are enriched -and specific- in each stage (1,2,3,4).
For that, I've used deseq2 and after I input all the data together (to get normalisation across all samples),
a) I performed Wald test in pairwise comparisons 1vs2, 2vs3, 3vs4 and used the logFC and padj for stepwise differential expression and
b) I performed LRT test (ddsLRT <- DESeq(dds, test="LRT", reduced= ~ 1) because I don't have more variables e.g. treated vs untreated) and then I also performed Wald test for all the 6 different pairwise comparisons possible (1vs2, 1vs3, 1vs4, 2vs3, 2vs4, 3vs4). From the LRT results, 70%(!) of the genes have a padj<0.05 (I still have not understood how the logFC can be interpreted and used in this case - any input would be highly appreciated). What I find striking is that many of those genes do not show any statistically significant difference in any of the pairwise comparisons and vise versa.
If I use the individual wald tests and the intersections of them to find e.g. genes enriched genes in condition 1 by doing: logFC>1, padj<0.05 in 1vs2 and 1vs3 and 1vs4, I also get genes that do not show a significant padj in the LRT test.
I am struggling to find a way to actually apply the right statistical to find the actual differentially expressed genes that are "enriched"/show a peak in one condition. Any ideas please??