Question: Bonferroni multiple corrections question
13 months ago by
jevanveen20 wrote:

Hello Biostars!

I have been analyzing some single cell RNA sequencing data, which compares neural transcriptomes from mice treated with either vehicle, or a drug. All the basic stuff is going fine.

When I get my clusters, I find few interesting, significant drug-induced DEGs. If I do GSEA on each cluster, however, things look very interesting and align very well with the published literature. All good so far.

So my problem is this - I use fgsea in R to do my GSEAs, but I am doing the GSEAs on 18 different clusters. That means that my adjusted p values that I get from fgsea are not valid - they need to be corrected to reflect the 18 repeated tests.

If I were to Bonferroni correct those p values, would it be acceptable to take the fgsea output adjusted p values that were already corrected for the number of gene sets tested, and then correct them again? Or is it more appropriate to take the raw p value from fgsea and somehow Bonferroni it for both the number of gene sets tested and also the number of clusters in which I am performing the tests?

Basically, can you Bonferroni an adjp that has already been Bonferronied?

Thanks very much in advance for your insights! Ed

modified 13 months ago by dsull1.6k • written 13 months ago by jevanveen20

To clarify: fgsea adjusted p-values are Benjamini-Hochberg-adjusted, not Bonferroni-adjusted.

written 13 months ago by alserg590

Hi jevanveen,

I was thinking of running GSEA on my scRNA-seq data, but I'm still debating what the input from every single cluster should be. How do you project the complexity of the cluster? Are you using the average gene expression from each cluster? Are you sampling a few cells from each cluster?

I'd really appreciate your thoughts on that. Thanks!

written 13 months ago by kathy.ushakov0
13 months ago by
dsull1.6k wrote:

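You perform Bonferroni correction by multiplying each p-value by the total number of tests performed (capping the result at 1) or, equivalently, by dividing your p-value threshold by that number.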

If you have 18 clusters and 100 gene sets, divide by 1800 (e.g. your p-value threshold would become 0.05/1800 instead of 0.05).

Yes, you can Bonferroni a Bonferroni-adjusted p-value. Using the example above, if GSEA already multiplies by 100 for you, then all you have to do is multiply by 18, which is mathematically equivalent to multiplying the raw p-values by 1800 (or, equivalently, dividing your p-value threshold by 1800).
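As a rough illustration of the equivalence described above, here is a minimal Python sketch. The helper `bonferroni`, the raw p-values, and the count of 100 gene sets are made up for the example (only the 18 clusters come from the question), and it assumes the first-stage adjustment really is a plain multiply-by-100 Bonferroni.

```python
def bonferroni(p_values, n_tests):
    """Multiply each p-value by n_tests and cap the result at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


n_gene_sets = 100  # hypothetical number of gene sets tested per cluster
n_clusters = 18    # number of clusters from the question
raw_p = [1e-6, 2e-5, 0.0004, 0.03]  # made-up raw p-values

# Correct once for all 1800 tests ...
one_step = bonferroni(raw_p, n_gene_sets * n_clusters)
# ... or correct for the gene sets first, then again for the clusters.
two_step = bonferroni(bonferroni(raw_p, n_gene_sets), n_clusters)

# Equivalent view: leave the p-values alone and shrink the threshold instead.
threshold = 0.05 / (n_gene_sets * n_clusters)

for p, adj1, adj2 in zip(raw_p, one_step, two_step):
    print(p, adj1, adj2, adj1 < 0.05, p < threshold)
```

Both routes give the same adjusted values (up to floating-point rounding), and comparing an adjusted value against 0.05 gives the same call as comparing the raw value against 0.05/1800.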

Note: If you don't see any cluster being significant for any gene set, that doesn't mean no enrichment exists. Bonferroni is known for low statistical power (you're likely to get false negatives). The family-wise error rate (FWER), which Bonferroni controls, is the probability of making at least one false positive (e.g. with an alpha of 0.05, Bonferroni ensures that this probability is <= 0.05). There are also more powerful methods that still control the FWER, such as Holm-Bonferroni.
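To make the Holm-Bonferroni remark concrete, here is a small Python sketch of the standard step-down adjustment next to plain Bonferroni. The function names and the four p-values are invented for illustration. This is not what fgsea does internally.

```python
def bonferroni(p_values):
    """Plain Bonferroni: multiply every p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment, returned in the original input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


p = [0.01, 0.04, 0.03, 0.005]  # made-up raw p-values
print(bonferroni(p))  # approximately [0.04, 0.16, 0.12, 0.02]
print(holm(p))        # approximately [0.03, 0.06, 0.06, 0.02]
```

Holm's adjusted values are never larger than Bonferroni's, which is why it can reject more hypotheses while still controlling the FWER.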

written 13 months ago by dsull1.6k

This is fantastically helpful and straight to the point. Thank you very much!!!

written 13 months ago by jevanveen20