Question: Benjamini+ Hochberg multiple testing p.adjust R
1
2.4 years ago by
dp0b60
dp0b60 wrote:

Hi,

Im just trying to get my head around Benjamini + Hochberg multiple testing using the p.adjust approach in R. So is the BH mutliple testing option (pval *no of tests)/rank ?

I have a dummy dataset and bonferroni multiple testing works as should

``````pvals = c(2.335810e-07, 4.820826e-07, 4.820826e-07, 5.807533e-07, 5.807533e-07,6.954857e-07)
pval2=as.numeric(pvals)

[1] 1.401486e-06 2.892496e-06 2.892496e-06 3.484520e-06 3.484520e-06
[6] 4.172914e-06
``````

However, when I try the Benjamini and Hochberg option the adjusted pvalues are all the same

``````BH = p.adjust(pval2, "BH")
[1] 6.954857e-07 6.954857e-07 6.954857e-07 6.954857e-07 6.954857e-07
[6] 6.954857e-07
``````

If there is any advice on what Im doing wrong it would be much appreciated or if anybody knows the exact formula p.adjust uses for BH as well.

Thanks

modified 23 months ago by Renesh1.9k • written 2.4 years ago by dp0b60
2
2.4 years ago by
Devon Ryan97k
Freiburg, Germany
Devon Ryan97k wrote:

is the BH mutliple testing option (pval *no of tests)/rank ?

No, it involves the cumulative minimum. It's most useful to think of BH adjustment as subtracting out a uniform distribution of p-values from your distribution. That won't produce the actual p-values, but it's a more useful visual representation. Alternatively, the adjusted p-value is the excess significance after plotting the p-value vs. its rank.

Given a vector of p-values `p` of length `n`:

``````i = n:1  # The reverse rank order
o <- order(p, decreasing = TRUE)
ro <- order(o)
pmin(1, cummin(n/i * p[o]))[ro]
``````

Coincidentally, that's the exact code that `p.adjust()` uses.

p.adjust does not use your `q`.

I also don't understand your comment on subtracting out a uniform distribution, or the "excess significance".

The adjusted p-value is simply the minimum FDR under which the test is significant.

1

It looks like I had the BY method rather than BH, I'll update the answer to correct that.

Regarding subtracting out a uniform, that's one of the ways of conceptually understanding how BH works.

1
2.4 years ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche24k wrote:

Have you tried reading the paper ? The procedure is described in section 3. What it says is:
- Order the n p-values (smallest first)
- For a desired false discovery rate threshold alpha, compare p-value p(i) to alpha * i/n
- Find k as the largest i satisfying p(i) <= alpha * i/n
- Reject all null hypotheses p(i) for all i such that p(i)<=p(k)
Alternatively, the corrected p-value can be computed as min(p(i)*n/i, corrected.p(i+1)) for i from n-1 to 1

1
2.4 years ago by
Collin860
United States
Collin860 wrote:

TL;DR There is nothing wrong with the output from the p.adjust function using the "BH" method. If you had more p-values that were not as highly similar, the results would generate different "adjusted p-values" from the "BH" method. You can see the exact calculations performed by p.adjust by simply typing "p.adjust" without parentheses into the R terminal.

What you're observing is a by product of the p.adjust package using a cumulative min when you specify the BH procedure. The original BH procedure (https://www.jstor.org/stable/2346101) doesn't actually specify an adjusted p-value corresponding to each p-value, rather it specifies a criterion to reject the null hypothesis:

``````[Equation 1 in paper]
let k be the largest i for which P(i) <= i/m * q^*
then reject all H(i), i=1,...,k
``````

where P(i) is the i'th smallest p-value in your vector, H(i) is the i'th hypothesis, m is the number of tests, and q^* is the the desired false discovery rate.

Note that, as you point out, q = P(i) / (i / m) is basically the equivalent of an "adjusted p-value" for the BH procedure. Normally you just reject all null hypotheses that have q less than some value. Except there is a problem. q is not necessarily monotonic as you get to larger p-values. This could result in not calling all hypothesis as significant, as specified in the BH procedure to reject all H(i), from i = 1, ..., k. The fix in the p.adjust function was to use a cumulative min so that the rejection of null hypotheses will be the same as specified in the original paper.

0
23 months ago by
Renesh1.9k
United States
Renesh1.9k wrote: