How important is filtering reads in edgeR?
3.5 years ago
M.ap • 0

I have 7 samples (3 control and 4 treated) and the aim is a DEG list. I have read the vignette for edgeR and it strongly recommends filtering using:

keep <- filterByExpr(y, design)

However, my group recommended that I not filter because I have few samples. Without filtering, I would keep every gene in the differential expression analysis. From what I understand, though, it is important to filter out lowly expressed genes to get meaningful results.

Can anyone tell me if I should filter or not?

RNA-Seq edger • 1.1k views
3.5 years ago
ATpoint 82k

You should, because the experts (the developers of edgeR) recommend it, as you say. The function takes the design into account, so having few samples does not matter.
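For reference, a minimal sketch of where the filtering step sits in an edgeR workflow. The counts here are simulated purely for illustration (1000 hypothetical genes, half of them lowly expressed), and the 3 control / 4 treated layout is taken from the question; with real data you would build `y` from your own count matrix.

```r
library(edgeR)

set.seed(1)

# Simulated counts, for illustration only: 1000 genes x 7 samples.
# The first 500 genes are lowly expressed, the rest are well expressed.
group  <- factor(c(rep("control", 3), rep("treated", 4)))
mu     <- rep(c(2, 200), each = 500)
counts <- matrix(rnbinom(1000 * 7, mu = mu, size = 5), nrow = 1000)
rownames(counts) <- paste0("gene", seq_len(1000))

y      <- DGEList(counts = counts, group = group)
design <- model.matrix(~ group)

# Logical vector, one entry per gene: TRUE means the gene has enough
# counts in enough samples, given the group sizes in the design.
keep <- filterByExpr(y, design)

# Drop the filtered genes and recompute library sizes from what is left.
y <- y[keep, , keep.lib.sizes = FALSE]

# Normalization is then computed on the filtered object.
y <- calcNormFactors(y)

table(keep)
```

`table(keep)` shows how many genes were dropped and kept, which is a quick sanity check before moving on to dispersion estimation.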

Filtering makes even more sense with few samples (and therefore lower power): genes with low counts probably lack the power to be called significant anyway, so removing them reduces the multiple testing burden.
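To make the multiple testing point concrete, here is a toy sketch in base R. All p-values are made up: 500 genes with real signal, 19,500 expressed genes without signal, and 35,000 low-count genes whose p-values are assumed to be spread evenly over (0, 1). The numbers are chosen so the effect is easy to see; real data will be less clear-cut.

```r
# Evenly spaced p-values in (0, 1), standing in for genes with no signal.
even_p <- function(n) (seq_len(n) - 0.5) / n

p_signal   <- seq(1e-6, 1e-3, length.out = 500)  # hypothetical true DE genes
p_null     <- even_p(19500)                      # expressed, no signal
p_lowcount <- even_p(35000)                      # low counts, no signal

idx <- seq_along(p_signal)

# Benjamini-Hochberg adjustment over all genes versus only the kept genes.
fdr_all      <- p.adjust(c(p_signal, p_null, p_lowcount), method = "BH")[idx]
fdr_filtered <- p.adjust(c(p_signal, p_null), method = "BH")[idx]

# Number of signal genes passing FDR < 0.05 in each case.
n_sig_all      <- sum(fdr_all < 0.05)
n_sig_filtered <- sum(fdr_filtered < 0.05)
c(unfiltered = n_sig_all, filtered = n_sig_filtered)
```

The raw p-values of the signal genes are identical in both runs; only the number of tests changes, and with it the adjusted values.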

Splendid, thank you, that makes sense. I hadn't considered the multiple testing burden!

The reduction is notable. For example, the GENCODE annotation for mouse contains about 55,000 genes. In my data that usually gets filtered down to roughly 20,000 genes in "normal-sized" experiments with a few groups and 3-6 replicates per group. I am not sure whether the filtering notably affects the dispersion estimation and size factor calculations; I would need to check the manual and paper again. From what I understand, though, filtering should be the default.

Thank you, sorry for the late reply. I'll have a read of the manual and see if I can find anything.
