I am performing an analysis on RNA-Seq data, where only genes relating to specific pathways are of interest to the researcher.
1) One option is to normalize, estimate the dispersions, and test for differential expression (DE) with DESeq2 as usual, using all genes (since DESeq2's assumption that most genes are not differentially expressed pertains to the whole set of genes, not to the subset).
Following that, it would be possible to manually select only the relevant subset of genes and apply the FDR correction only to this subset (based on the p-values calculated when taking all genes into account).
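To make the proposed procedure concrete, here is a minimal sketch in plain Python (not DESeq2 itself). The gene names, p-values, and pathway membership are all made up for illustration, and `bh_adjust` is a hand-written Benjamini-Hochberg helper rather than a library function. It assumes the pathway gene set was fixed before looking at any test results.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from a test run on ALL genes (made-up numbers).
all_pvalues = {
    "geneA": 0.001, "geneB": 0.02, "geneC": 0.03, "geneD": 0.2,
    "geneE": 0.5, "geneF": 0.04, "geneG": 0.8, "geneH": 0.6,
}
# Hypothetical pathway membership, chosen before seeing any results.
pathway_genes = ["geneA", "geneB", "geneC", "geneD"]

# FDR correction restricted to the pathway subset.
subset_p = [all_pvalues[g] for g in pathway_genes]
subset_padj = dict(zip(pathway_genes, bh_adjust(subset_p)))

# FDR correction over all genes, for comparison.
genome_padj = dict(zip(all_pvalues, bh_adjust(list(all_pvalues.values()))))

for gene in pathway_genes:
    print(gene, round(subset_padj[gene], 4), round(genome_padj[gene], 4))
```

In this toy example the raw p-values are identical in both cases; only the number of tests entering the correction changes, so the subset-adjusted values come out smaller. In R, the same correction on a subset of p-values can be done with `p.adjust(..., method = "BH")`.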
This is somewhat analogous imho to what independent filtering does. After p-values have been calculated for all genes, independent filtering keeps only the genes whose mean expression is above a certain cutoff, choosing the cutoff that maximizes the number of genes called significant. The explicitly stated goal is to increase the number of significantly DE genes, the rationale being that genes with low expression are not interesting in the first place.
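As I understand the idea, it can be sketched roughly as follows in plain Python. This is a simplified illustration only: the mean counts, p-values, and candidate cutoffs are made up, `bh_adjust` and `choose_cutoff` are hypothetical helpers, and DESeq2's actual procedure differs in detail.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def choose_cutoff(mean_counts, pvalues, candidate_cutoffs, alpha=0.1):
    """Return the (cutoff, rejections) pair with the most BH rejections."""
    best_cutoff, best_count = None, -1
    for cutoff in candidate_cutoffs:
        # Keep only genes whose mean count reaches the cutoff.
        kept = [p for mean, p in zip(mean_counts, pvalues) if mean >= cutoff]
        rejections = sum(padj < alpha for padj in bh_adjust(kept))
        # Strict comparison: on ties, the earlier (lower) cutoff is kept.
        if rejections > best_count:
            best_cutoff, best_count = cutoff, rejections
    return best_cutoff, best_count


# Made-up data: four low-count genes with large p-values, six higher-count genes.
mean_counts = [1, 2, 3, 4, 50, 60, 70, 80, 90, 100]
pvalues = [0.9, 0.7, 0.8, 0.6, 0.02, 0.03, 0.04, 0.5, 0.045, 0.3]

best_cutoff, best_count = choose_cutoff(mean_counts, pvalues, [0, 10, 75])
print(best_cutoff, best_count)
```

The filter statistic here (the mean count) is computed without reference to the condition labels, which is what makes filtering on it before the multiple-testing correction defensible.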
Here, what defines which genes are interesting is not the mean level, but the inclusion in a specific set.
Would the described process be suitable, and is there another one if not?