Setting a threshold for FDR correction
5.4 years ago

Hello, I am studying correction for multiple testing and am having some trouble understanding FDR.

• significance level=0.05
• number of tests=100

For Bonferroni correction, the p-value is multiplied by the number of tests: raw p-value (0.001) × number of tests (100) = Bonferroni-corrected p-value (0.1). So this p-value is not significant at alpha = 0.05.

Equivalently, we can set the cutoff for Bonferroni correction to 0.05/100 = 0.0005. The raw p-value of 0.001 is then not significant, because 0.001 is greater than 0.0005.
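
The two Bonferroni views above (adjust the p-value, or adjust the cutoff) should always give the same decision. Here is a minimal Python sketch using the numbers from this question; the variable names are only illustrative.

```python
alpha = 0.05      # significance level
n_tests = 100     # number of tests
raw_p = 0.001     # raw p-value from the example

# Option 1: adjust the p-value (capped at 1) and compare it with alpha.
adjusted_p = min(raw_p * n_tests, 1.0)
significant_by_adjusted_p = adjusted_p <= alpha

# Option 2: adjust the cutoff and compare the raw p-value with it.
cutoff = alpha / n_tests
significant_by_cutoff = raw_p <= cutoff

print("adjusted p-value:", adjusted_p, "significant:", significant_by_adjusted_p)
print("adjusted cutoff:", cutoff, "significant:", significant_by_cutoff)
```

Both comparisons report the p-value as not significant, since multiplying the p-value by 100 and dividing alpha by 100 are the same inequality written two ways.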

For FDR correction, as I understand it, the corrected p-value is calculated as p-value × rank / number of tests. If this p-value ranks fifth among 100 tests, raw p-value (0.001) × 5/100 = FDR-corrected p-value (0.00005). So this p-value would still be significant.

My question is: how can I set the cutoff for FDR correction (analogous to 0.05/100 for the Bonferroni correction above)?

Multiple testing • False discovery rate • FDR
5.4 years ago
russhh 5.7k

Your description of FDR correction isn't right. If you've tested M (independent [1]) hypotheses, and your gene has a raw p-value of P and rank I, then its FDR-adjusted p-value is P × M / I ([2]). With your numbers that is 0.001 × 100 / 5 = 0.02, not 0.00005. The way you described it, the FDR-adjusted p-value would never be higher than the raw p-value (which wouldn't make sense for a correction).

[1] - let's face it, genes don't behave independently, so it's a bit strange for us to treat hypotheses about gene expression as independent, but that's a conversation for another day

[2] - there is a slightly different way of computing the FDR-adjusted p-value: "The adjusted P value for a test is either the raw P value times m/i or the adjusted P value for the next higher raw P value, whichever is smaller"; for more details see http://www.biostathandbook.com/multiplecomparisons.html
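
The rule quoted in [2] can be sketched in plain Python. This is only a minimal illustration, not a replacement for a vetted library routine; the function name `bh_adjust` and the five p-values are made up for the example.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda k: pvalues[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        scaled = pvalues[idx] * m / rank          # raw P value times m/i
        running_min = min(running_min, scaled)    # "... whichever is smaller"
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from five tests.
raw = [0.001, 0.008, 0.039, 0.041, 0.042]
for p, q in zip(raw, bh_adjust(raw)):
    print(f"raw p = {p:.3f}  adjusted p = {q:.3f}")
```

Because m/i is never below 1, an adjusted p-value computed this way is never smaller than the corresponding raw p-value.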


Thanks for your kind answer and correction. What I was wondering is how to set a threshold for the FDR correction. As I understand it, the cutoff has to be determined from all the p-values, not just from the number of tests as in the Bonferroni correction. In the example data from the link you gave, the largest FDR-corrected p-value <= 0.05 is 0.05 (Protein, rank 5), so the cutoff for FDR correction should be 0.042 (the raw p-value of Protein). Am I getting it right? This is my guess, but I'm not confident in it. Thanks for your help, russhh!
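
For what it's worth, the cutoff idea can be written down as a short Python sketch of the standard Benjamini-Hochberg step-up rule: sort the raw p-values, compare the p-value at rank i with (i/m) × alpha, and take the largest raw p-value that passes as the cutoff. The function name `bh_cutoff` and the five p-values are hypothetical and chosen only for illustration.

```python
def bh_cutoff(pvalues, alpha=0.05):
    """Largest raw p-value p(i) with p(i) <= (i/m) * alpha, or None if none passes."""
    m = len(pvalues)
    cutoff = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * alpha:
            cutoff = p  # keep overwriting, so the largest passing p-value wins
    return cutoff


# Hypothetical raw p-values from five tests.
raw = [0.001, 0.008, 0.039, 0.041, 0.042]
cutoff = bh_cutoff(raw, alpha=0.05)
print("raw p-value cutoff:", cutoff)
if cutoff is not None:
    print("significant:", [p for p in raw if p <= cutoff])
```

In this made-up data the p-values at ranks 3 and 4 fail their own comparison, yet they still count as significant because a larger p-value (rank 5) passes. Unlike 0.05/100 for Bonferroni, the cutoff cannot be fixed before the p-values are seen.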


I would recommend the excellent videos by Josh Starmer at UNC called "StatQuest". I find them to be very helpful and concise. Maybe this would be of use?

StatQuest: FDR and the Benjamini-Hochberg Method clearly explained