Question

P-value Threshold Consideration in Multi-Sample RNA-Seq Experiment

0

6 months ago

Netanel • 0

Hello,

In my RNA-seq experiment, I applied three treatments: exposure to chemicals A, B, and A+B. My objective is to identify differentially expressed genes (DEGs) in the A+B treatment compared to treatments A or B alone, which serve as controls. My approach involved running edgeR on A vs A+B and B vs A+B, followed by examining the intersection of the results.

Given that I'm using an intersection approach, keeping only genes that are differentially expressed in both A vs A+B and B vs A+B, can I justify a p-value threshold higher than 0.05 for each separate DEG analysis (A vs A+B and B vs A+B)? I want to emphasize that the same A+B reads were used in both comparisons.

Thank you!

statistics RNA-Seq p-value • 446 views

ADD COMMENT • link updated 6 months ago by Ram 44k • written 6 months ago by Netanel • 0

1

You can do anything you want as long as you state it in the methods -- whether a reviewer will like it is a different question. These sorts of questions cannot be answered satisfyingly. Given that you ask, I assume you get few DEGs at 0.05? Alternatively, ignore the p-values altogether and do some sort of meta-analysis to find consistent DEGs. Problem with intersections is that they ignore the fact that for example in dataset-A the p-value is 0.049 and in the second one 0.051 -- literally the same but a hard cutoff problem. There is no standard answer for this. Just go ahead, try things, report them transparently and then move on to downstream analysis. After all, DEGs are just one step towards building a hypothesis that needs some sort of validation.
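One way to sidestep the 0.049 / 0.051 situation is to combine the two comparisons per gene before applying any cutoff. The sketch below is only an illustration, not something edgeR does for you: it takes the larger of the two raw p-values for each gene (the intersection-union or "max-p" conjunction test) and applies a Benjamini-Hochberg correction once, to the combined values. The gene names and p-values are made up, the helper names `conjunction_pvalues` and `bh_adjust` are hypothetical, and the sketch ignores the direction of the fold change, which you would also want to check.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def conjunction_pvalues(p_first, p_second):
    """Per-gene maximum of the two raw p-values (intersection-union test)."""
    return {g: max(p_first[g], p_second[g]) for g in p_first if g in p_second}


# Made-up raw p-values for the two comparisons (illustration only).
p_a_vs_ab = {"gene1": 0.0004, "gene2": 0.049, "gene3": 0.30, "gene4": 0.001}
p_b_vs_ab = {"gene1": 0.0010, "gene2": 0.051, "gene3": 0.0002, "gene4": 0.002}

p_max = conjunction_pvalues(p_a_vs_ab, p_b_vs_ab)
genes = sorted(p_max)
padj = dict(zip(genes, bh_adjust([p_max[g] for g in genes])))

for gene in genes:
    print(gene, p_max[gene], padj[gene])
```

Here gene2 gets a single combined value of 0.051 instead of landing on opposite sides of a cutoff in the two lists, and gene3 is penalised for being strong in only one comparison.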

ADD REPLY • link 6 months ago by ATpoint 84k

1

Problem with intersections is that they ignore the fact that for example in dataset-A the p-value is 0.049 and in the second one 0.051 -- literally the same but a hard cutoff problem.

For an unbiased approach that mitigates the situation ATpoint described above, you can use Rank-Rank Hypergeometric Overlap (RRHO). Quoting from the paper:

Current techniques to compare expression profiles typically involve choosing a fixed differential expression threshold to summarize results, potentially reducing sensitivity to small but concordant changes. We present a threshold-free algorithm called Rank–rank Hypergeometric Overlap (RRHO).
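To give a feel for the idea, here is a heavily simplified sketch and not the published RRHO implementation. It assumes both comparisons have already been turned into ranked gene lists over the same genes (for example by signed significance), slides a cutoff down each list, and computes a hypergeometric over-representation p-value for the overlap at every pair of cutoffs. The function names and the toy gene lists are hypothetical, and the sketch leaves out under-representation and the multiple-testing correction a real analysis would need.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric random variable X."""
    total = comb(population, draws)
    upper = min(successes, draws)
    return sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    ) / total


def rrho_matrix(ranked_first, ranked_second, step=1):
    """Overlap p-value for every pair of rank cutoffs (i, j)."""
    n = len(ranked_first)
    cutoffs = range(step, n + 1, step)
    result = {}
    for i in cutoffs:
        top_first = set(ranked_first[:i])
        for j in cutoffs:
            overlap = len(top_first & set(ranked_second[:j]))
            result[(i, j)] = hypergeom_upper_tail(overlap, n, i, j)
    return result


# Toy ranked lists, most strongly changed gene first (illustration only).
ranked_a_vs_ab = ["g1", "g2", "g3", "g4", "g5", "g6"]
ranked_b_vs_ab = ["g2", "g1", "g3", "g6", "g5", "g4"]

overlap_pvalues = rrho_matrix(ranked_a_vs_ab, ranked_b_vs_ab)
best_cutoffs = min(overlap_pvalues, key=overlap_pvalues.get)
print(best_cutoffs, overlap_pvalues[best_cutoffs])
```

The point is that no single significance threshold is chosen up front; the pair of cutoffs with the strongest overlap falls out of the scan.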

ADD REPLY • link 6 months ago by Haci ▴ 730

0

Thank you for your response. As you guessed, I observed only a few differentially expressed genes (DEGs) at a significance level of 0.05.

To elaborate on my question, I'm considering the Multiplication Rule of Probability. If we have two independent events, A and B, each with a probability of 1/2, the probability of both occurring is (1/2)*(1/2) = 1/4.

I'm curious whether a similar principle can be applied in this context. Specifically, if the target is an overall p-value of 0.05 for the intersection, can we set the threshold for each comparison to its square root, which is approximately 0.224? However, I'm uncertain how the fact that the same A+B reads were used in both comparisons affects this calculation, since the two tests are then not independent.
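A quick simulation can probe this. The sketch below is a toy model under loudly stated assumptions, not an edgeR analysis: each group is reduced to a normally distributed mean with known variance, a plain z-test stands in for the real test, and the same simulated A+B mean is reused in both comparisons. The function names, `n=3` replicates, and the shift of 5 are arbitrary choices for illustration. It counts how often a gene is called in both comparisons when (1) nothing differs at all, and (2) A+B differs from A but not from B, which is a gene that should not be in the intersection.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def both_significant_rate(threshold, shift_a=0.0, n=3, trials=20000, seed=1):
    """Fraction of simulated genes called in BOTH comparisons.

    Group means are drawn as Normal(mu, 1/n). A+B and B have mu = 0,
    A has mu = shift_a. The A+B mean is shared by both comparisons.
    """
    rng = random.Random(seed)
    sd_mean = 1 / math.sqrt(n)   # standard deviation of one group mean
    sd_diff = math.sqrt(2 / n)   # standard deviation of a difference of means
    hits = 0
    for _ in range(trials):
        mean_ab = rng.gauss(0.0, sd_mean)
        mean_a = rng.gauss(shift_a, sd_mean)
        mean_b = rng.gauss(0.0, sd_mean)
        p_a = two_sided_p((mean_ab - mean_a) / sd_diff)
        p_b = two_sided_p((mean_ab - mean_b) / sd_diff)
        if p_a < threshold and p_b < threshold:
            hits += 1
    return hits / trials


root = math.sqrt(0.05)
global_null = both_significant_rate(root)
partial_null = both_significant_rate(root, shift_a=5.0)
partial_null_strict = both_significant_rate(0.05, shift_a=5.0)

print("sqrt threshold, nothing differs:    ", global_null)
print("sqrt threshold, only A differs:     ", partial_null)
print("0.05 threshold, only A differs:     ", partial_null_strict)
```

In this toy setting the shared A+B group makes the two tests positively correlated, so even with no true differences the joint rate at the square-root threshold should come out above 0.05. More importantly, when a gene truly differs in one comparison only, the first test is almost always significant and the joint false-positive rate approaches the per-comparison threshold itself, so relaxing each cutoff to about 0.224 would not keep the intersection at 0.05.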

ADD REPLY • link 6 months ago by Netanel • 0