Why don't we apply multiple hypothesis testing correction for multiple bouts of differential expression analysis?
3.4 years ago

Hello all,

I've seen people recommend repeating the DESeq2 analysis if you want to study multiple contrasts for a single factor, for example in the answer "A: DESeq2 compare all levels".

However, in the above, aren't you rerunning hypothesis tests repeatedly? Why isn't there an additional step of p-value adjustment after you finish running all the tests (each of which has its own p-value adjustment to account for within-test multiple hypothesis testing correction)?

Is there something different about (running 5 different tests on 5 different genes) and (running 2 tests of 5 genes apiece)? Should we as biologists refrain from using an individual test's adjusted p-value, and instead run a multiple hypothesis testing correction on the unadjusted p-values from all the tests together, taking into account that multiple tests have been performed?
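To make the two options concrete, here is a minimal sketch in base R. The p-values are made-up numbers standing in for the `pvalue` column of two DESeq2 results tables, and all variable names are hypothetical. It only shows the mechanics of "adjust within each contrast" versus "pool everything and adjust once"; it does not settle which one is appropriate.

```r
# Hypothetical raw p-values for 5 genes in two contrasts (invented for
# illustration; in practice these would come from res$pvalue).
genes <- paste0("gene", 1:5)
p_contrast1 <- c(0.001, 0.012, 0.030, 0.200, 0.700)
p_contrast2 <- c(0.004, 0.020, 0.045, 0.500, 0.900)

# Option A: Benjamini-Hochberg within each contrast (5 tests each).
padj_separate1 <- p.adjust(p_contrast1, method = "BH")
padj_separate2 <- p.adjust(p_contrast2, method = "BH")

# Option B: pool all 10 p-values and apply Benjamini-Hochberg once.
padj_pooled <- p.adjust(c(p_contrast1, p_contrast2), method = "BH")
padj_pooled1 <- padj_pooled[seq_along(genes)]
padj_pooled2 <- padj_pooled[length(genes) + seq_along(genes)]

# Side-by-side comparison of the two adjustment strategies.
comparison <- data.frame(
  gene      = genes,
  separate1 = padj_separate1,
  pooled1   = padj_pooled1,
  separate2 = padj_separate2,
  pooled2   = padj_pooled2
)
print(comparison)
```

With real DESeq2 output, some `pvalue` entries can be `NA` (for example for outlier genes), and the per-contrast `padj` also reflects independent filtering, so a pooled adjustment would need to decide how to handle those genes.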

Also, does the above impact whether or not to examine intersections of gene lists? For example, say your model was:

design=~condition

and the conditions were Control, treatment1, treatment2, treatment3. You then make the following contrasts:

results1 <- results(dds, contrast=c("condition", "Control", "treatment1"))
results2 <- results(dds, contrast=c("condition", "Control", "treatment2"))
results3 <- results(dds, contrast=c("condition", "Control", "treatment3"))

Let's say you want to understand which genes are statistically significantly upregulated in treatment1 as well as in treatment2. Would it be appropriate to look at the intersection of results1 and results2 to address that question? Or is that bad form, because each DE gene list was generated assuming statistical significance for that specific contrast alone? Or would looking at the intersection actually help identify the "true DE genes" in results1 and results2, by eliminating genes erroneously called DE as a consequence of multiple hypothesis testing?
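For reference, the intersection being asked about could be computed roughly as follows. This is a sketch with invented data frames that mimic the `log2FoldChange` and `padj` columns of DESeq2 results tables; the helper `up_in_treatment` and the 0.05 cutoff are assumptions for illustration. Note that DESeq2 contrasts are written as c(factor, numerator, denominator), so with "Control" listed first, a gene that is higher in the treatment has a negative log2 fold change.

```r
# Invented stand-ins for two DESeq2 results tables (not real data).
genes <- paste0("gene", 1:5)
res1 <- data.frame(
  row.names      = genes,
  log2FoldChange = c(-2.1, -1.5, 0.3, -1.8, 1.2),
  padj           = c(0.001, 0.03, 0.6, 0.04, 0.01)
)
res2 <- data.frame(
  row.names      = genes,
  log2FoldChange = c(-1.9, 0.2, -0.4, -2.2, 1.0),
  padj           = c(0.002, 0.5, 0.7, 0.01, NA)
)

# Hypothetical helper: genes significantly higher in the treatment.
# Control is the numerator here, so "up in treatment" means LFC < 0.
# Genes with padj = NA are treated as not significant.
up_in_treatment <- function(res, alpha = 0.05) {
  keep <- !is.na(res$padj) & res$padj < alpha & res$log2FoldChange < 0
  rownames(res)[keep]
}

up1 <- up_in_treatment(res1)
up2 <- up_in_treatment(res2)

# Genes called upregulated in both treatments.
shared <- intersect(up1, up2)
print(shared)
```

This only shows how the lists are combined; whether the resulting intersection has a controlled error rate is exactly the open question above.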

Thank you very much for your help!

RNA-Seq • 1.2k views
3 months ago
ivingan • 0

I currently have the same question. I do not know the answer, but my thesis work relies heavily on multiple bouts of analyses like this.

I found this recent article, which argues that it is a necessary step and provides an algorithm to do so: Lathan Liou, Milena Hornburg, David S Robertson, "Global FDR control across multiple RNAseq experiments", Bioinformatics, Volume 39, Issue 1, January 2023, btac718.
