Question: Adjusted P-value in GO analysis
9 months ago by
DNA0
DNA0 wrote:

Hi!

I am doing GO analysis with tools such as EnrichR. All the results come out with good P values, but the adjusted P values are around 0.5 or even close to 1. Is this a glitch, or is it expected? And are the results then statistically significant?

Thanks!

go gene ontology p value • 986 views
modified 9 months ago by EagleEye5.8k • written 9 months ago by DNA0
9 months ago by
Philipp Bayer5.7k
Australia/Perth/UWA
Philipp Bayer5.7k wrote:

Since a GO enrichment analysis runs many, many, many tests (typically one test for each GO term), just by chance alone you would expect to see terms that seemingly have a significant p-value.

Let's say you set your p-value cutoff to 0.05. You run one test and your p-value is above 0.05, nothing of significance. You keep on repeating tests, and by test 20 you have a significant p-value, hooray! However, with a p-value cutoff of 0.05, you expect to see one significant test purely by chance after 20 tests (20*0.05 = 1). See slide 4 here for another example.
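To make the arithmetic concrete, here is a minimal Python sketch. It assumes that all 20 tests are independent and that no term is truly enriched (both are simplifying assumptions, and the function names are just illustrative):

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests below the cutoff when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance that at least one of n independent null tests falls below the cutoff."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests, alpha = 20, 0.05
expected = expected_false_positives(n_tests, alpha)
at_least_one = prob_at_least_one_false_positive(n_tests, alpha)

print(f"Expected false positives in {n_tests} tests: {expected:.2f}")
print(f"Probability of at least one false positive: {at_least_one:.2f}")
```

Under these assumptions the chance of at least one spurious hit in 20 tests is roughly 64%, and it only grows with the thousands of terms in a typical GO library.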

That's why software that runs many, many tests applies multiple testing correction to get around this problem: it adjusts the p-values based on the number of tests you ran (roughly the number of GO terms tested).
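As an illustration of how such a correction works, here is a small pure-Python sketch of the Benjamini-Hochberg (FDR) adjustment, one common method. Whether your tool uses exactly this method is an assumption you should check in its documentation, and the p-values below are made up for the example:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value down so each adjusted value
    # never exceeds the adjusted value of a larger raw p-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five terms.
raw = [0.001, 0.01, 0.03, 0.04, 0.5]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f}  adjusted p = {q:.3f}")

# With many more tests, a raw p-value of 0.01 no longer looks impressive:
# one p-value of 0.01 among 199 unremarkable ones (all 0.9 here).
many = [0.01] + [0.9] * 199
print(f"adjusted p for 0.01 among 200 tests: {benjamini_hochberg(many)[0]:.2f}")
```

The second example mirrors the situation in the question: a raw p-value that looks good on its own can end up with an adjusted value near 1 once the number of tests is large.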

In your case I would ignore the column of unadjusted p-values and just look at the adjusted p-values: if they are not below 0.05, the terms are not significant.

9 months ago by
theobroma221.1k
theobroma221.1k wrote:

You can use the P-value... although the change is not really expected, as it’s dependent on the mathematics underlying the adj. p-value calculation. Essentially, you should have (well, based on Bayes) defined your cut-off value prior to doing the calculation. Anyway, if you are required to use the adj. p-value, then 0.05 is typically the standard cut-off. Using this metric ensures there are fewer false positives compared to the raw P-value.

9 months ago by
EagleEye5.8k
Sweden
EagleEye5.8k wrote:

Though what Philipp Bayer is saying is true, the p-value metric is still used to filter the enriched processes. In addition to that, I would also recommend that you consider filtering by the percentage of genes from a particular pathway/biological process that are covered by your input list.

If you cannot find the information (% genes covered in pathway) from EnrichR, try GeneSCF.
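If you have the gene lists at hand, the coverage filter described above can be computed yourself. This is only a rough sketch: the function name, the gene symbols, the term names, and the 25% threshold are all made up for illustration and do not come from EnrichR or GeneSCF:

```python
def percent_genes_covered(input_genes, term_genes):
    """Percentage of a term's genes that also appear in the input list."""
    term = set(term_genes)
    if not term:
        return 0.0
    overlap = term & set(input_genes)
    return 100.0 * len(overlap) / len(term)


# Hypothetical input list and term annotations.
my_genes = ["TP53", "BRCA1", "EGFR", "MYC"]
terms = {
    "DNA repair (hypothetical)": ["TP53", "BRCA1", "ATM", "CHEK2"],
    "cell growth (hypothetical)": ["MYC", "A", "B", "C", "D", "E", "F", "G", "H", "I"],
}

min_coverage = 25.0  # arbitrary threshold, chosen only for this example
coverage = {name: percent_genes_covered(my_genes, genes) for name, genes in terms.items()}
kept = [name for name, pct in coverage.items() if pct >= min_coverage]

for name, pct in coverage.items():
    print(f"{name}: {pct:.1f}% of term genes covered")
print("Terms passing the coverage filter:", kept)
```

A sensible threshold depends on the size of the term and of your input list, so treat the cutoff as something to tune rather than a fixed rule.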

A: GeneSCF and P-value Benjamini and Hochberg (FDR)