Bonferroni multiple corrections question
4.5 years ago
jevanveen ▴ 20

Hello Biostars!

I have been analyzing some single cell RNA sequencing data, which compares neural transcriptomes from mice treated with either vehicle, or a drug. All the basic stuff is going fine.

When I get my clusters, I find few interesting, significant drug-induced DEGs. If I do GSEA on each cluster, however, things look very interesting, and align with published literature very well. All good so far.

So my problem is this - I use fgsea in R to do my GSEAs, but I am doing the GSEAs on 18 different clusters. That means that my adjusted p values that I get from fgsea are not valid - they need to be corrected to reflect the 18 repeated tests.

If I were to Bonferroni correct those p values, would it be acceptable to take the fgsea output adjusted p values that were already corrected for the number of gene sets tested, and then correct them again? Or is it more appropriate to take the raw p value from fgsea and somehow Bonferroni it for both the number of gene sets tested and also the number of clusters in which I am performing the tests?

Basically, can you Bonferroni an adjp that has already been Bonferronied?

Thanks very much in advance for your insights! Ed

RNA-Seq Statistics scRNA-Seq GSEA • 3.1k views

To clarify: fgsea adjusted p-values are Benjamini-Hochberg-adjusted, not Bonferroni-adjusted.


Hi jevanveen,

I was thinking of running GSEA on my scRNA-seq data, but I'm still debating what the input from every single cluster should be. How do you capture the complexity of the cluster? Are you using the average gene expression from each cluster? Are you sampling a few cells from each cluster?

I'd really appreciate your thoughts on that. Thanks!

4.5 years ago
dsull ★ 5.8k

You perform Bonferroni correction by dividing your p-value threshold by the total number of tests performed (or, equivalently, by multiplying each raw p-value by that number).

If you have 18 clusters and 100 gene sets, divide by 1800 (e.g. your p-value threshold would become 0.05/1800 instead of 0.05).

Yes, you can Bonferroni a Bonferroni-adjusted p-value. Using the example above, if GSEA already multiplies by 100 for you, then all you have to do is multiply by 18, which is mathematically equivalent to multiplying the raw p-values by 1800 (or, equivalently, dividing your p-value threshold by 1800).
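To make the arithmetic concrete, here is a minimal Python sketch of that equivalence. The p-values are made up for illustration and `bonferroni` is a hypothetical helper, not part of fgsea. It only applies when both steps really are Bonferroni; as noted in the comment on the question, fgsea's adjusted p-values are Benjamini-Hochberg-adjusted, so with fgsea you would likely want to start from the raw p-values instead.

```python
def bonferroni(p_values, n_tests):
    """Multiply each p-value by n_tests and cap the result at 1."""
    return [min(1.0, p * n_tests) for p in p_values]

# Hypothetical raw p-values (made-up numbers, for illustration only).
raw_p = [0.00001, 0.0004, 0.02, 0.6]

n_gene_sets = 100  # gene sets tested per cluster, as in the example above
n_clusters = 18    # clusters in which GSEA was run

# Two-step: correct for the number of gene sets, then for the number of clusters.
two_step = bonferroni(bonferroni(raw_p, n_gene_sets), n_clusters)

# One-step: correct for all 100 * 18 = 1800 tests at once.
one_step = bonferroni(raw_p, n_gene_sets * n_clusters)

for p, a, b in zip(raw_p, two_step, one_step):
    print(f"raw={p:g}  two-step={a:.4g}  one-step={b:.4g}")
```

Both routes give the same adjusted values (up to floating-point rounding), including when the cap at 1 kicks in.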

Note: If you don't see any cluster being significant in any gene set, that doesn't mean an enrichment doesn't exist. Bonferroni is known for low statistical power (you're likely to get false negatives). The family-wise error rate (FWER), which Bonferroni controls, is merely the probability of making at least one false positive (e.g. Bonferroni ensures that the probability of making at least one false positive is <= 0.05, assuming an alpha of 0.05). Even if you want to control the FWER, there are more powerful methods that still do so, such as the Holm-Bonferroni method.
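As a rough illustration of the power difference, here is a minimal Python sketch of Holm-Bonferroni (step-down) adjustment next to plain Bonferroni. The p-values are made up, and `holm` and `bonferroni_all` are hypothetical helper names written for this example; in R, `p.adjust` with `method = "holm"` or `method = "bonferroni"` covers the same ground.

```python
def holm(p_values):
    """Holm-Bonferroni step-down adjusted p-values, returned in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, p_values[idx] * (m - rank))
        # Keep adjusted values non-decreasing along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def bonferroni_all(p_values):
    """Plain Bonferroni: multiply every p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values (made-up numbers, for illustration only).
raw_p = [0.001, 0.012, 0.013, 0.04, 0.3]

holm_adj = holm(raw_p)
bonf_adj = bonferroni_all(raw_p)

alpha = 0.05
n_holm = sum(p < alpha for p in holm_adj)
n_bonf = sum(p < alpha for p in bonf_adj)
print("Holm significant:", n_holm)
print("Bonferroni significant:", n_bonf)
```

With these made-up numbers Holm keeps three tests below 0.05 while Bonferroni keeps only one, and a Holm-adjusted value is never larger than the corresponding Bonferroni-adjusted one.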


This is fantastically helpful and straight to the point. Thank you very much!!!
