Doesn't It Make Sense To Reduce Multiple Testing By Being More Selective With Genes Investigated?
4
9
10.9 years ago
jobinv ★ 1.1k

When working with a microarray containing tens of thousands of probes, it makes sense that multiple testing is an issue. I also understand that it is common to perform multiple testing correction (for example with Bonferroni or Benjamini-Hochberg), where the more genes you are testing, the more stringent this becomes.
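To make the "more genes, more stringent" point concrete, here is a minimal sketch of the two corrections mentioned above, written in plain Python. The function names are my own for illustration, not taken from any package; in practice you would more likely use an existing implementation such as R's p.adjust.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (step-up FDR procedure)."""
    m = len(pvalues)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order (1..m)
        # Enforce monotonicity: an adjusted value never exceeds that of a larger p.
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# The same raw p-value is penalised more heavily as the number of tests grows.
raw_p = 0.0001
for n_tests in (100, 20000):
    print(n_tests, "tests -> Bonferroni-adjusted p:", min(1.0, raw_p * n_tests))
```

Both adjustments scale with the number of tests m, which is why testing fewer probes makes any given raw p-value easier to keep below the cutoff.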

However, let's say that I have a particular microarray study that was designed to answer a question specifically about genes related to metabolism and how they are affected between two test conditions. In such a case it makes sense to me to first remove all genes from my expression matrix that are not related to metabolism (I'm not immediately sure how one would go about doing this, but I imagine it must be possible?), before performing statistical analysis and subsequent multiple testing correction. This would, I guess, allow for easier detection of results that are relevant to my particular research question.
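A rough sketch of what I have in mind, assuming per-gene p-values have already been computed and that a list of metabolism genes was fixed before looking at the data (in practice it would presumably come from an annotation source such as Gene Ontology or KEGG). The gene names, p-values, and helper functions here are all made up for illustration.

```python
def restrict_to_gene_set(pvalues_by_gene, gene_set):
    """Keep only the genes that belong to the pre-specified gene set."""
    return {g: p for g, p in pvalues_by_gene.items() if g in gene_set}


def bonferroni_significant(pvalues_by_gene, alpha=0.05):
    """Return the genes whose raw p-value passes the Bonferroni threshold alpha / m."""
    m = len(pvalues_by_gene)
    if m == 0:
        return []
    threshold = alpha / m
    return sorted(g for g, p in pvalues_by_gene.items() if p <= threshold)


# Hypothetical data: 1000 genes, one of which (gene7) has a small p-value.
all_p = {f"gene{i}": 0.5 for i in range(1000)}
all_p["gene7"] = 0.0004

# Hypothetical annotation: the first 50 genes are "metabolism" genes.
metabolism_genes = {f"gene{i}" for i in range(50)}

# Full array: threshold is 0.05 / 1000, so gene7 does not pass.
print(bonferroni_significant(all_p))

# Metabolism subset: threshold is 0.05 / 50, so gene7 passes.
print(bonferroni_significant(restrict_to_gene_set(all_p, metabolism_genes)))
```

The raw p-value for gene7 is identical in both cases; only the number of tests in the correction changes.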

My question: is this thinking correct? And if so, why is it not done more frequently?

microarray • 3.5k views
8
10.9 years ago
brentp 24k

Yes

There's a nice paper on this by Bourgon, Gentleman, and Huber:

http://www.ncbi.nlm.nih.gov/pubmed/20460310

They show that a "two-stage approach that first filters variables by a criterion independent of the test statistic, and then only tests variables which pass the filter, can provide higher power." They also give some examples of filtering that can introduce bias.

A common way to filter is to keep the probes with the highest variance (that is, remove low-variance probes).
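As a minimal sketch of that kind of filter, assuming the expression data is a mapping from probe ID to its values across all samples (both groups pooled): the variance is computed without using the group labels, which is the sense in which the filter is meant to be independent of the test statistic. The function name, the keep_fraction parameter, and the toy data are my own for illustration.

```python
from statistics import variance


def filter_by_variance(expression, keep_fraction=0.5):
    """Keep the most variable probes, ignoring which group each sample belongs to.

    expression: dict mapping probe ID -> list of expression values (all samples).
    keep_fraction: fraction of probes to retain, ranked by overall sample variance.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Rank probes from most to least variable across all samples.
    ranked = sorted(expression, key=lambda probe: variance(expression[probe]), reverse=True)
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return {probe: expression[probe] for probe in ranked[:n_keep]}


# Toy data: two nearly flat probes and two that vary across samples.
expression = {
    "probe_a": [5.0, 5.1, 5.0, 5.1],
    "probe_b": [2.0, 8.0, 3.0, 9.0],
    "probe_c": [7.0, 7.0, 7.1, 7.0],
    "probe_d": [1.0, 4.0, 1.5, 4.5],
}

kept = filter_by_variance(expression, keep_fraction=0.5)
print(sorted(kept))
```

Only the probes that survive the filter go on to the test and the multiple testing correction.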

1

One thing I forgot until looking at the paper today is that removing the probes with the lowest variance can mess up limma, since its moderated statistics borrow variance information across probes. So that's something to keep in mind when doing variance-based filtering.

0

Thanks for the link. This is an approach that I have been advocating a lot, but mostly on gut feeling and common sense, so it is great to have something more meaningful to back it up.

0

I had also been thinking about this recently in the context of both microarray and RNA-Seq experiments after some discussions with colleagues. My impression was that it would obviously increase power, but that you have to be very careful not to introduce experimenter bias into the equation. One of the benefits of not filtering is that you aren't introducing any bias into the system, and you may discover significant and unexpected results. I would probably do both a filtered and an unfiltered analysis if I opted to do this.

5
10.9 years ago

If you were to build a custom microarray to answer your question about which metabolism genes respond to your testing conditions, this is exactly what you would do, and I think it would be fine.

The problem arises if you find little or nothing regarding metabolism and then decide to go and look at stress response instead, and then maybe yet another pathway. Then the assumptions for multiple testing would not hold, because the correction for each subset ignores the other subsets you have already tested.
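A back-of-the-envelope sketch of why this matters, under the simplifying assumption (mine, and rarely exactly true for overlapping pathways) that each pathway analysis is independent and individually controlled at level alpha:

```python
def chance_of_any_false_positive(alpha, n_analyses):
    """Probability that at least one of n independent analyses, each run at
    level alpha, produces a false positive when there is no real effect."""
    return 1.0 - (1.0 - alpha) ** n_analyses


def per_analysis_alpha(alpha, n_analyses):
    """Bonferroni-style level to use per analysis so that the overall
    error rate across all analyses stays at or below alpha."""
    return alpha / n_analyses


# One pre-planned pathway versus trying pathway after pathway.
for k in (1, 3, 10):
    print(k, "pathways tried -> overall chance of a false lead:",
          round(chance_of_any_false_positive(0.05, k), 3))
```

So if you do end up trying several pathways, the number of pathways tried is itself something the correction should account for.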

0

Right, I see, so it would limit future questions we could ask of the material, in other words. I agree, that's an obvious flaw with this reasoning.

4
10.9 years ago

Sure, if you're a priori only interested in a certain subset of genes/transcripts/probes, then you don't benefit from testing everything else. Many of us aren't a priori interested in a certain limited subgroup when doing microarray or RNA-seq experiments (otherwise, you might just use qPCR), so that's probably why you don't see this done as often.

I should also point out the genefilter package on Bioconductor, for other uses of filtering that are unrelated to your immediate question.

0

I'm not familiar with the genefilter package; I will have to look that up. Thanks!

2
10.9 years ago
Asaf 10k
  1. Let's take this approach to the extreme: if I'm only interested in one gene, and instead of testing just that gene I perform a microarray experiment, is it OK to look only at that gene? I think not, because of the way microarrays are processed and normalized. I don't know how this applies to a group of genes.
  2. You might be correct, but if I read a line like "we chose metabolism related genes ..." in a paper, I will automatically think that they took the genes that showed an effect and did the correction on those. So it might be concise, but it smells bad.

0

The normalization effect is something I had not considered. That is a good point, yes. And yeah, I can see how it could look suspicious :)
