Question: Multiple testing correction issue
CY (United States) wrote, 5 weeks ago:

I am aware that multiple testing correction is needed when multiple hypothesis tests are performed simultaneously. However, I am a little confused by the word "simultaneously".

Say I have called somatic variants from hundreds of samples. With this sample-by-variant matrix, I would like to perform pairwise Fisher's exact tests for variant co-occurrence. Do I need to carry out multiple testing correction on the p-value of each variant pair?

I guess multiple testing correction is needed. However, can't I consider that each pairwise Fisher's exact test was carried out individually, so that each raw p-value represents the significance of the corresponding variant pair?
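To make the setup concrete, here is a minimal sketch of one such test on a single variant pair, with made-up counts and a hand-rolled hypergeometric calculation (in practice you would use a library routine such as `scipy.stats.fisher_exact`; the counts below are purely illustrative):

```python
from math import comb

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins whose probability is <= that of the observed table.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def table_prob(x):
        # P(X = x) under the hypergeometric null with fixed margins
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = table_prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # small relative tolerance to absorb floating-point noise
    return sum(table_prob(x) for x in range(lo, hi + 1)
               if table_prob(x) <= p_obs * (1 + 1e-9))

# Co-occurrence of variant A and variant B across 40 samples (made-up):
#            B present  B absent
# A present      8          2
# A absent       3         27
p = fisher_exact_p(8, 2, 3, 27)
```

Each variant pair yields one such p-value; with hundreds of variants, the number of pairs (and p-values) grows quadratically.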

modified 5 weeks ago by swbarnes2 • written 5 weeks ago by CY

The reason to use multiple testing correction is that with raw p-values you accept a 5% false positive rate (with a p < 0.05 cutoff, about 5 out of every 100 true-null tests will come out significant by chance). Therefore, if you have thousands of tests (your pairwise Fisher's exact tests), the expected number of false positives is no longer acceptable (5% of thousands is too many). Read more about it e.g. here.
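A quick simulation (illustrative, pure Python) of that point: under the null hypothesis, p-values are uniform on [0, 1], so a fixed 0.05 cutoff flags roughly 5% of purely null tests.

```python
import random

random.seed(0)
n_tests = 10_000

# Under the null, p-values are uniform on [0, 1], so with a 0.05 cutoff
# roughly 5% of purely null tests come out "significant" by chance.
null_pvalues = [random.random() for _ in range(n_tests)]
false_positives = sum(p < 0.05 for p in null_pvalues)
# expect false_positives to be close to 500 out of 10,000
```

With 10,000 null tests you expect around 500 spurious "hits", even though nothing real is there.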

modified 5 weeks ago • written 5 weeks ago by Benn

The thing is that I am conducting each pairwise Fisher's exact test individually. Every time, I select only one pair of variants and perform the hypothesis test.

written 5 weeks ago by CY

How many times do you conduct this test? In other words, how many p-values do you get in total?

modified 5 weeks ago • written 5 weeks ago by Benn

I got multiple p-values from multiple hypothesis tests (probably > 100). Note, though, that each test is conducted on a different pair of variants. Can I treat them as totally separate tests, so that I don't have to apply multiple testing correction?

written 5 weeks ago by CY

No, you should use multiple testing correction. Multiple testing correction is exactly meant for cases like yours.

written 5 weeks ago by Benn

You should edit your question to add this information, and maybe a small example. But the rule of thumb is to always correct for multiple testing if you perform the same test multiple times (e.g. perform a differential expression test for each gene; each gene will "have" a p-value; then you should correct for multiple testing).

written 5 weeks ago by Nicolas Rosewick

Here is what I am confused about:

For a specific gene showing a barely significant raw p-value (one that would not be significant if correction were applied), if I happened to calculate the significance for this gene alone, it would still be significant, because I wouldn't need to perform correction in that case, right?

written 5 weeks ago by CY
shawn.w.foley (USA) wrote, 5 weeks ago:

The word "simultaneously" really refers to the same analysis or group of analyses, not to the p-values being dependent on one another. As mentioned in some of the comments, it's really a matter of correcting for what could be a large number of false positives. In the example given, you're testing for co-occurrence of hundreds of variants, so you should correct these p-values because you are performing hundreds of tests. The intention is to weed out the false positives: although you'll have fewer "significant" hits, you'll have far more confidence in those hits.

written 5 weeks ago by shawn.w.foley

Say we have a pair of variants whose p-value becomes non-significant only after correction is applied. If I happened to calculate this one variant pair alone, it would still be significant, because correction is not needed in that case, right? It seems contradictory to me: we get a different result for this very variant pair although nothing changed.

written 5 weeks ago by CY

It's the difference between hypothesis testing and a discovery experiment. Don't think about it as "significant" becoming "non-significant"; think of it as applying different cutoffs based on the experiment.

For an extreme example, consider a GWAS, where you might have millions of SNPs. In that case, if we say that 95% confidence is sufficient, we'll end up with hundreds of thousands of SNPs that are "significant", but intuitively we know that can't be the case. The odds of >100,000 independent mutations all yielding the same phenotype are astronomical; it doesn't make biological sense. What that says is that 95% isn't enough confidence. In fact, GWAS typically requires 99.999999% confidence (p < 1e-8).
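That genome-wide threshold falls out of a simple Bonferroni calculation (the figure of ~5 million tests below is an illustrative assumption, not from the thread):

```python
# Bonferroni: to keep the family-wise error rate at 0.05 across
# ~5 million SNPs (an assumed, typical GWAS test count), each
# individual test must clear 0.05 / 5e6, i.e. roughly the familiar
# genome-wide significance threshold of 1e-8.
n_snps = 5_000_000
alpha = 0.05
per_test_threshold = alpha / n_snps  # ~1e-8
```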

Instead of thinking about multiple testing correction as changing your p-value, think of it as setting a new threshold for significance. An FDR < 0.05 might correspond to p < 0.00001. Therefore, if your SNP of interest has a p-value of 0.03, it would be significant if your cutoff were p < 0.05, it would not be significant if your cutoff were p < 0.01, and it is definitely not significant now that you're using a cutoff of p < 0.00001 (aka FDR < 0.05).
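A sketch of how such a data-driven threshold arises, using the Benjamini-Hochberg step-up rule (the p-values below are made up for illustration; in practice you'd use e.g. `statsmodels.stats.multitest.multipletests`):

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return the largest p-value cutoff at the given FDR (BH step-up rule)."""
    m = len(pvalues)
    cutoff = 0.0
    for i, p in enumerate(sorted(pvalues), start=1):
        # keep the largest p that satisfies p <= fdr * rank / m
        if p <= fdr * i / m:
            cutoff = p
    return cutoff

# 100 tests: a handful of strong signals plus noise (illustrative numbers)
pvals = [0.0001, 0.0004, 0.002, 0.009, 0.03] + \
        [0.2 + 0.008 * i for i in range(95)]
threshold = benjamini_hochberg(pvals)
significant = [p for p in pvals if p <= threshold]
```

Note that the raw p-value of 0.03 would pass an uncorrected 0.05 cutoff, but falls outside the BH threshold once all 100 tests are taken into account.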

modified 5 weeks ago • written 5 weeks ago by shawn.w.foley

Statistics is not easy, I agree on that for sure. You say:

We got different result for this very variant pair although nothing changed.

But things have in fact changed, namely the number of tests performed. The correction is based on the number of tests you have performed.
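This is easiest to see with the simplest correction, a Bonferroni adjustment (the raw p-value of 0.03 echoes the example earlier in the thread):

```python
# Bonferroni-adjusted p-value: p_adj = min(1, p * m), where m is the
# number of tests performed. The same raw p-value of 0.03 survives a
# 0.05 cutoff when it is the only test, but not as one of 100 tests.
raw_p = 0.03
p_adj_single = min(1.0, raw_p * 1)     # tested alone: still 0.03
p_adj_hundred = min(1.0, raw_p * 100)  # one of 100 tests: capped at 1.0
```

Same raw p-value, different adjusted value, because m changed.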

modified 5 weeks ago • written 5 weeks ago by Benn
swbarnes2 (United States) wrote, 5 weeks ago:

Maybe this will help illustrate the issue:

https://xkcd.com/882/

written 5 weeks ago by swbarnes2
Powered by Biostar version 2.3.0