Hello,

In my RNA-seq experiment, I applied three treatments: exposure to chemicals A, B, and A+B. My objective is to identify differentially expressed genes (DEGs) in the A+B treatment compared to treatments A or B alone, which serve as controls. My approach involved running edgeR on A vs A+B and B vs A+B, followed by examining the intersection of the results.

Given that I'm employing an intersection approach, where I seek genes that are differentially expressed in both A vs A+B and B vs A+B, can I justify using a p-value threshold higher than 0.05 for each separate DEG analysis (A vs A+B and B vs A+B)? I want to emphasize that the same A+B reads were used in both comparisons.

Thank you!

You can do anything you want as long as you state it in the methods; whether a reviewer will like it is a different question. These sorts of questions cannot be answered satisfyingly. Given that you ask, I assume you get few DEGs at 0.05? Alternatively, ignore the p-values altogether and do some sort of meta-analysis to find consistent DEGs. The problem with intersections is that they ignore cases where, for example, the p-value is 0.049 in the first comparison and 0.051 in the second: practically the same evidence, but split by a hard cutoff. There is no standard answer for this. Just go ahead, try things, report it transparently, and then move on to downstream analysis. After all, DEGs are just one step in building a hypothesis that needs some sort of validation.
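To make the hard-cutoff problem concrete, here is a minimal sketch. The gene names and p-values are hypothetical, made up purely for illustration, and ranking by the larger of the two p-values is just one simple alternative to a strict intersection, not a recommended standard.

```python
# Hypothetical p-values from the two comparisons (illustrative only).
p_a_vs_ab = {"gene1": 0.001, "gene2": 0.049, "gene3": 0.20}
p_b_vs_ab = {"gene1": 0.003, "gene2": 0.051, "gene3": 0.01}

cutoff = 0.05

# Strict intersection: a gene must pass the cutoff in both comparisons.
intersection = sorted(
    g for g in p_a_vs_ab
    if p_a_vs_ab[g] < cutoff and p_b_vs_ab.get(g, 1.0) < cutoff
)

# Alternative: rank genes by their worse (larger) p-value instead of
# applying a hard cutoff, so borderline genes stay visible.
ranked = sorted(p_a_vs_ab, key=lambda g: max(p_a_vs_ab[g], p_b_vs_ab[g]))

print("intersection:", intersection)
print("ranked by worst p-value:", ranked)
```

Here gene2 drops out of the intersection because of a 0.051 in one comparison, yet it still ranks directly behind gene1 when genes are ordered by their worse p-value.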

For an unbiased approach that mitigates the hard-cutoff problem ATpoint described above, you can use Rank-Rank Hypergeometric Overlap (RRHO). Quoting from their paper:

Thank you for your response. As you guessed, I observed only a few differentially expressed genes (DEGs) at a significance level of 0.05.

To elaborate on my question, I'm considering the Multiplication Rule of Probability. If we have event A with a probability of 1/2 and event B with the same probability of 1/2, the probability of both events occurring simultaneously is calculated as (1/2)*(1/2) = 1/4.

I'm curious whether a similar principle can be applied in this context. Specifically, if the target is a final p-value of 0.05 for the intersection, can we use its square root, approximately 0.224, as the threshold for each comparison? However, I'm uncertain how the fact that the same A+B reads were used in both comparisons might affect this calculation.
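One way to probe this question is a small simulation. The sketch below is a toy model, not edgeR: it assumes each group is summarized by a standardized mean with known unit variance and uses simple z-tests, and the function name `both_pass_rate` and its parameters are hypothetical. It estimates how often a gene passes both comparisons at a per-comparison threshold of sqrt(0.05) in three situations: no true effect with independent A+B samples, no true effect with the same A+B sample reused, and a strong true effect in A vs A+B only.

```python
import math
import random

def two_sided_p(z):
    # Two-sided p-value for a standard normal test statistic.
    return math.erfc(abs(z) / math.sqrt(2))

def both_pass_rate(threshold, shared, shift_a=0.0, n_genes=100_000, seed=1):
    """Fraction of simulated genes with p < threshold in BOTH comparisons.

    shared=True reuses one A+B value for both comparisons (as in the question);
    shared=False draws a separate A+B value for each (independent tests).
    shift_a is a true effect in group A, in units of the standard error.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        a = rng.gauss(shift_a, 1.0)
        b = rng.gauss(0.0, 1.0)
        ab_first = rng.gauss(0.0, 1.0)
        ab_second = ab_first if shared else rng.gauss(0.0, 1.0)
        # Difference of two unit-variance means has variance 2.
        z_a = (a - ab_first) / math.sqrt(2)
        z_b = (b - ab_second) / math.sqrt(2)
        if two_sided_p(z_a) < threshold and two_sided_p(z_b) < threshold:
            hits += 1
    return hits / n_genes

threshold = math.sqrt(0.05)

independent = both_pass_rate(threshold, shared=False)
shared = both_pass_rate(threshold, shared=True)
one_real = both_pass_rate(threshold, shared=True, shift_a=10.0)

print(f"per-comparison threshold:        {threshold:.4f}")
print(f"no effect, independent A+B:      {independent:.4f}")
print(f"no effect, shared A+B:           {shared:.4f}")
print(f"real effect in A vs A+B only:    {one_real:.4f}")
```

In this toy setup the independent case should land near 0.05, as the multiplication rule predicts. Reusing the same A+B sample makes the two tests positively correlated, so the joint rate comes out noticeably higher. When only one comparison has a real effect, the joint rate is close to the per-comparison threshold itself, which suggests the square-root relaxation does not keep the error rate of the intersection at 0.05 under these assumptions.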