Question

Doesn't It Make Sense To Reduce Multiple Testing By Being More Selective With Genes Investigated?

9

10.9 years ago

jobinv ★ 1.1k

When working with a microarray containing tens of thousands of probes, it makes sense that multiple testing is an issue. I also understand that it is common to perform multiple testing correction (for example with Bonferroni or Benjamini-Hochberg), where the more genes you are testing, the more stringent this becomes.

However, let's say that I have a particular microarray study that was designed to answer a question specifically about genes related to metabolism and how they are affected between two test conditions. In such a case it makes sense to me to first remove all genes from my expression matrix that are not related to metabolism (I'm not immediately sure how one would go about doing this, but I imagine it must be possible?), before performing statistical analysis and subsequent multiple testing correction. This would, I guess, allow for easier detection of results that are relevant to my particular research question.
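To make the intuition concrete, here is a minimal sketch of how the Benjamini-Hochberg adjustment treats the same raw p-values when 20,000 probes are tested versus 500. The p-values, the probe counts, and the `bh_adjust` helper are all made up for illustration, and the comparison only makes sense if the subset was chosen before looking at the results.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five metabolism probes (illustrative only).
metabolism_hits = [0.00001, 0.00004, 0.00009, 0.0002, 0.0004]

# Pad with uninteresting probes that all have p = 0.5.
whole_array = metabolism_hits + [0.5] * 19995  # 20,000 probes tested
subset_only = metabolism_hits + [0.5] * 495    # 500 probes tested

adj_whole = bh_adjust(whole_array)[:5]
adj_subset = bh_adjust(subset_only)[:5]

for raw, a, b in zip(metabolism_hits, adj_whole, adj_subset):
    print(f"raw={raw:.5f}  BH(20000)={a:.3f}  BH(500)={b:.3f}")
```

With these toy numbers, none of the five probes passes a 0.05 cutoff after correction across the whole array, while all five pass when only the 500-probe subset is tested.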

My question: is this thinking correct? And if so, why is it not done more frequently?

microarray • 3.5k views

ADD COMMENT • link updated 10.9 years ago by brentp 24k • written 10.9 years ago by jobinv ★ 1.1k

score 8 · Answer 1 · 2013-06-13

10.9 years ago

brentp 24k

Yes

There's a nice paper on this by Bourgon, Gentleman, and Huber:

http://www.ncbi.nlm.nih.gov/pubmed/20460310

They show that a "two-stage approach that first filters variables by a criterion independent of the test statistic, and then only tests variables which pass the filter, can provide higher power." They also give some examples of filtering that can introduce bias.

A common way to filter is to keep the probes with the highest variance (i.e., remove low-variance probes).
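A variance-based filter of this kind could be sketched roughly as follows. The important property is that the variance is computed across all samples without looking at the group labels, so the filter does not peek at the comparison being tested. The function name, the `keep_fraction` parameter, and the toy data are assumptions for illustration, not anything taken from the paper or from genefilter.

```python
from statistics import variance


def filter_by_overall_variance(expr, keep_fraction=0.5):
    """Keep the most variable probes, ignoring group labels.

    expr: dict mapping probe id -> expression values across ALL samples.
    keep_fraction: fraction of probes to retain (hypothetical default).
    """
    # Overall variance uses every sample and never looks at which
    # condition a sample belongs to.
    variances = {probe: variance(values) for probe, values in expr.items()}
    n_keep = max(1, int(len(expr) * keep_fraction))
    ranked = sorted(variances, key=variances.get, reverse=True)
    kept = set(ranked[:n_keep])
    # Preserve the original probe order in the output.
    return {probe: values for probe, values in expr.items() if probe in kept}


# Toy data: 4 probes x 6 samples (made-up numbers).
expr = {
    "probe_A": [5.0, 5.1, 4.9, 5.0, 5.1, 5.0],  # nearly flat
    "probe_B": [3.0, 3.2, 2.9, 6.1, 6.0, 5.8],  # large spread
    "probe_C": [7.0, 7.0, 7.1, 7.0, 6.9, 7.0],  # nearly flat
    "probe_D": [2.0, 4.0, 3.0, 5.0, 1.0, 6.0],  # large spread
}

filtered = filter_by_overall_variance(expr, keep_fraction=0.5)
print(sorted(filtered))  # probes that would go on to testing and correction
```

Only the retained probes would then be tested, and the multiple testing correction would be applied to that smaller set.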

ADD COMMENT • link 10.9 years ago by brentp 24k

1

One thing I forgot until looking at the paper today is that removing probes with the lowest variance will mess up limma, whose moderated t-statistics borrow variance information across all probes. So that's something to keep in mind when doing variance-based filtering.

ADD REPLY • link 10.9 years ago by brentp 24k

0

Thanks for the link. This is an approach that I have been advocating a lot, but mostly on gut feeling and common sense, so it is great to have something more meaningful to back it up.

ADD REPLY • link 10.9 years ago by Istvan Albert 100k

0

I had also been thinking about this recently in the context of both microarray and RNA-Seq experiments after some discussions with colleagues. My impression was that it would obviously increase power, but that you have to be very careful not to introduce experimenter bias into the equation. One of the benefits of not filtering is that you aren't introducing any bias into the system and may discover significant and unexpected results. I would probably do both a filtered and an unfiltered analysis if I opted to do this.

ADD REPLY • link 10.9 years ago by DG 7.3k

score 5 · Answer 2 · 2013-06-13

10.9 years ago

Eric Normandeau 11k

If you were to build a custom microarray to answer your question about which metabolism genes respond to your testing conditions, this is exactly what you would do, and I think it would be fine.

The problem is if you don't find anything or much regarding metabolism and then decide to go and look at stress response instead, and then maybe yet another pathway. Then the assumptions for multiple testing would not hold.
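A rough way to see why moving from pathway to pathway is a problem: if each look is treated as its own analysis at level alpha, the chance of at least one false lead grows with the number of looks. The sketch below assumes the looks are independent and that there is no real effect anywhere, which is a simplification (real pathways share genes), and the function name is made up for illustration.

```python
def chance_of_false_lead(alpha, n_looks):
    """Probability of at least one false positive across n_looks
    independent analyses, each run at significance level alpha,
    when there is no real effect anywhere."""
    return 1.0 - (1.0 - alpha) ** n_looks


# How the risk grows as more pathways are examined one after another.
for n_looks in (1, 2, 3, 5, 10):
    print(n_looks, f"{chance_of_false_lead(0.05, n_looks):.3f}")
```

Correcting within each pathway separately does not account for this. The set of hypotheses being corrected has to cover everything that was actually examined.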

ADD COMMENT • link 10.9 years ago by Eric Normandeau 11k

0

Right, I see, so it would limit future questions we could ask of the material, in other words. I agree, that's an obvious flaw with this reasoning.

ADD REPLY • link 10.9 years ago by jobinv ★ 1.1k

score 4 · Answer 3 · 2013-06-13

10.9 years ago

Devon Ryan 104k

Sure, if you're a priori only interested in a certain subset of genes/transcripts/probes, then you don't benefit from testing everything else. Many of us aren't a priori interested in a certain limited subgroup when doing microarray or RNA-seq experiments (otherwise, you might just use qPCR), so that's probably why you don't see this done as often.

I should also point out the genefilter package on Bioconductor, for other uses of filtering that are unrelated to your immediate question.

ADD COMMENT • link 10.9 years ago by Devon Ryan 104k

0

I'm not familiar with the genefilter package; I will have to look that up. Thanks!

ADD REPLY • link 10.9 years ago by jobinv ★ 1.1k

score 2 · Answer 4 · 2013-06-13

10.9 years ago

Asaf 10k

Let's take this approach to the extreme: if I'm only interested in one gene, and instead of testing just that gene I perform a microarray experiment, is it OK to look only at that gene? I think not, because of the way microarrays are processed and normalized. I don't know how that applies to a group of genes.

You might be correct, but if I read a line like "we chose metabolism related genes ..." in a paper, I will automatically think that they took the genes that showed an effect and did the correction on those. So it might be concise, but it smells bad.

ADD COMMENT • link 10.9 years ago by Asaf 10k

0

The normalization effect is something I had not considered. That is a good point, yes. And yeah, I can see how it could look suspicious :)

ADD REPLY • link 10.9 years ago by jobinv ★ 1.1k