Question

Filtering during RNA Seq analysis?

7.7 years ago

aggregatibacter ▴ 180

Hi,

I have roughly 100 samples (2 groups) of noisy human tissue data, and I am thinking about filtering to reduce the multiple-testing burden. So far, I filter by CPM values (calculated with edgeR after counting with featureCounts) and then run voom/limma. I have also stumbled over HTSFilter, and have thought about variance filtering.
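For readers unfamiliar with the CPM filter mentioned above, here is a minimal, language-neutral sketch of the idea in plain Python. It is not the edgeR implementation (there the CPM values would come from `cpm()`); the function names, the toy counts, and the thresholds `min_cpm` and `min_samples` are all hypothetical and chosen only for illustration.

```python
def cpm(counts):
    """Convert raw counts to counts per million.

    counts: dict mapping gene -> list of raw counts, one entry per sample.
    Assumes every sample has a non-zero library size.
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    # Library size = total counts in each sample (column sums).
    lib_sizes = [sum(counts[g][j] for g in genes) for j in range(n_samples)]
    return {
        g: [counts[g][j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
        for g in genes
    }


def filter_by_cpm(counts, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM exceeds min_cpm in at least min_samples samples."""
    values = cpm(counts)
    return [
        g for g in counts
        if sum(v > min_cpm for v in values[g]) >= min_samples
    ]


# Toy data (hypothetical): 3 genes x 4 samples, each library sums to 1,000,000.
toy = {
    "geneA": [500, 450, 600, 550],
    "geneB": [0, 1, 0, 0],
    "geneC": [999_500, 999_549, 999_400, 999_450],
}
kept = filter_by_cpm(toy, min_cpm=10.0, min_samples=2)
print(kept)
```

Note that this filter never looks at the group labels, which is what makes it a candidate for use before the differential expression test.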

Now I have heard that filtering, at least by variance or with HTSFilter, is per se incompatible with limma and other empirical Bayes methods.

What do you experts think/suggest?

Many thanks!

RNA-Seq • 3.1k views

ADD COMMENT • link updated 7.7 years ago by i.sudbery 19k • written 7.7 years ago by aggregatibacter ▴ 180

score 4 · Answer 1 · 2016-08-17

You might also try "Independent Hypothesis Weighting" as implemented in the IHW Bioconductor package and described in this paper: http://www.ncbi.nlm.nih.gov/pubmed/27240256

Instead of filtering genes to conserve power, it weights them. It is compatible at least with DESeq2 and is used in their vignette (I suspect it can be used in conjunction with other packages as well).
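To make "weighting instead of filtering" concrete, here is a rough sketch of a weighted Benjamini-Hochberg adjustment in plain Python: each p-value is divided by its weight (weights rescaled to average 1) before the usual BH step. This is not the IHW package, which learns the weights from a covariate in a data-driven way; the weights and p-values below are made up, and the function names are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def weighted_bh(pvalues, weights):
    """BH applied to p / w, with non-negative weights rescaled to mean 1."""
    mean_w = sum(weights) / len(weights)
    norm = [w / mean_w for w in weights]
    # A weight of 0 means the hypothesis can never be rejected.
    scaled = [min(1.0, p / w) if w > 0 else 1.0 for p, w in zip(pvalues, norm)]
    return bh_adjust(scaled)


# Toy example (hypothetical values): the first two genes get higher weights,
# e.g. because a covariate such as mean expression suggests more power there.
pvalues = [0.001, 0.02, 0.03, 0.5]
weights = [1.5, 1.5, 0.5, 0.5]

plain = bh_adjust(pvalues)
weighted = weighted_bh(pvalues, weights)
alpha = 0.03
n_plain = sum(a <= alpha for a in plain)
n_weighted = sum(a <= alpha for a in weighted)
print(n_plain, n_weighted)
```

In this toy case the weighted version rejects one more hypothesis at the same threshold, without discarding any gene outright. As with filtering, the weights must not be derived from the test statistic itself.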

score 3 · Answer 2 · 2016-08-16

7.7 years ago

Devon Ryan 104k

The package you're looking for is genefilter. As an aside, that's used automatically by DESeq2.

BTW, filtering by variance isn't compatible with any RNA-seq package that I'm aware of: the variance is entangled with the DE test statistic, so the filter is not independent of the test.
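As an illustration of the kind of filtering that is compatible, here is a small sketch in plain Python: genes are filtered on their overall mean expression (computed without using the group labels), and the multiple-testing adjustment is applied only to the survivors. This is only loosely analogous to what DESeq2 does, not its actual code; the cutoff `min_mean`, the function names, and the numbers are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_then_adjust(pvalues, mean_expr, min_mean):
    """Adjust only genes with mean expression >= min_mean.

    Filtered-out genes get None, so they are clearly marked as untested.
    """
    keep = [i for i, mu in enumerate(mean_expr) if mu >= min_mean]
    adjusted = bh_adjust([pvalues[i] for i in keep])
    result = [None] * len(pvalues)
    for i, a in zip(keep, adjusted):
        result[i] = a
    return result


# Toy example (hypothetical): two well-expressed genes, four near-silent ones.
pvalues = [0.01, 0.02, 0.6, 0.7, 0.8, 0.9]
mean_expr = [120.0, 85.0, 0.4, 0.2, 0.5, 0.1]

unfiltered = bh_adjust(pvalues)
filtered = filter_then_adjust(pvalues, mean_expr, min_mean=1.0)
hits_unfiltered = sum(a <= 0.05 for a in unfiltered)
hits_filtered = sum(a is not None and a <= 0.05 for a in filtered)
print(hits_unfiltered, hits_filtered)
```

The point of the sketch is only that the filter statistic (overall mean) is chosen independently of the test. Swapping in the across-sample variance would break that, since a large variance is partly a consequence of a group difference.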

ADD COMMENT • link 7.7 years ago by Devon Ryan 104k

Thanks guys for your suggestions. I will definitely also try DESeq2. I have the feeling that it is quite en vogue at the moment, is that correct? Thus far, I primarily use limma, because I was brought up with arrays, and usually need to correct for random and fixed factors.

ADD REPLY • link 7.7 years ago by aggregatibacter ▴ 180