More results when prefiltering in DESeq2
10 months ago
Stella

Hello community,

I have been using DESeq2 for a while now. As the tutorial says, prefiltering is not strictly necessary; it mainly saves memory and run time, since the results function applies its own appropriate filtering. Running my analysis both ways (without prefiltering, or with a minimal filter of 10 counts per row), I have noticed that I sometimes get a few more genes in the prefiltering scenario. When I visualize them in a heatmap, those extra genes show a pattern similar to the genes that came out in both cases. So visually, the extra genes do seem differentially expressed to me.

I am a bit troubled about which scenario to trust more, or whether both are equally correct and it is just a statistical matter of how many genes you give to the tool and how that affects the padj values of the results.

Many thanks in advance!

The assumption behind removing rows with < 10 counts is that low-count genes could be background noise.
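For concreteness, a row-sum prefilter of this kind can be sketched as follows. This is plain Python rather than DESeq2's R interface, and the gene names, the counts, and the `MIN_TOTAL` cutoff are all made up for illustration. It assumes the rule is "keep a row if its counts summed over all samples reach 10".

```python
# Hypothetical raw counts: gene name -> read counts across four samples.
counts = {
    "geneA": [120, 98, 310, 275],
    "geneB": [0, 1, 0, 2],
    "geneC": [3, 2, 4, 1],
    "geneD": [0, 0, 0, 0],
}

# Assumed cutoff: keep a gene only if its total count is at least 10.
MIN_TOTAL = 10

# Keep the rows that reach the cutoff; drop the rest before testing.
kept = {gene: row for gene, row in counts.items() if sum(row) >= MIN_TOTAL}
dropped = sorted(set(counts) - set(kept))

print("kept:", sorted(kept))
print("dropped:", dropped)
```

Note that a gene sitting exactly at the cutoff (geneC here, with a total of 10) is kept under this rule.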

In general, filtering gives you results with less sensitivity but more specificity (fewer false positives), while not filtering gives you more sensitivity but less specificity (more false positives).

So the answer to your question depends on what you want in the end.
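On the padj part of the question: Benjamini-Hochberg adjusted p-values depend on how many genes are tested, so the same raw p-value can land on either side of a cutoff depending on how many rows went into the adjustment. The sketch below illustrates only that one effect. It is plain Python, not DESeq2 code, and the p-values and the 0.05 cutoff are invented for the example.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values (not real DESeq2 output).
signal = [0.0001, 0.001, 0.004, 0.012]         # genes with a real effect
other = [0.3, 0.5, 0.7, 0.9]                   # well-expressed, no effect
low_count = [0.25 + 0.06 * k for k in range(12)]  # low-count, no effect

ALPHA = 0.05  # assumed significance cutoff for this illustration

# Without prefiltering, all 20 genes enter the adjustment.
padj_all = bh_adjust(signal + other + low_count)
# With prefiltering, the 12 low-count genes are removed first.
padj_filtered = bh_adjust(signal + other)

n_all = sum(p < ALPHA for p in padj_all)
n_filtered = sum(p < ALPHA for p in padj_filtered)

print("significant without prefiltering:", n_all)
print("significant with prefiltering:", n_filtered)
```

In this toy case the gene with raw p = 0.012 gets padj = 0.06 among 20 genes but padj = 0.024 among 8, so it only passes after prefiltering. In a real DESeq2 run the results function already applies its own filtering, as noted in the question, so any difference from prefiltering would usually be limited to a few borderline genes.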

Ok! Many thanks for your answers, it is clear!


