I have 7 samples (3 control and 4 treated) and the aim is a DEG list. I have read the vignette for edgeR and it strongly recommends filtering using:
keep <- filterByExpr(y, design)
However, in my group it was recommended that I don't filter because I have few samples. Without filtering I would keep every gene in the differential expression analysis and not lose any. From what I understand, though, it's important to filter out lowly expressed genes for the results to be meaningful.
Can anyone tell me if I should filter or not?
Splendid, thank you, that makes sense. I hadn't considered the multiple testing burden!
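For anyone reading along, here is a minimal base-R sketch of what the multiple testing burden means in practice. The raw p-value and both gene counts are made-up numbers, not taken from any real dataset, and the example assumes the gene in question has the smallest p-value in the experiment. It only shows that the same raw p-value gets a larger Benjamini-Hochberg adjusted value when more genes are tested.

```r
# Hypothetical numbers for illustration only
p_raw        <- 1e-6    # raw p-value of the top-ranked gene (assumed)
n_unfiltered <- 60000   # genes tested without filtering (assumed)
n_filtered   <- 20000   # genes tested after filtering (assumed)

# With a single p-value and an explicit n, p.adjust() treats it as the
# smallest of n tests, so the BH-adjusted value is p_raw * n (capped at 1).
padj_unfiltered <- p.adjust(p_raw, method = "BH", n = n_unfiltered)
padj_filtered   <- p.adjust(p_raw, method = "BH", n = n_filtered)

print(padj_unfiltered)  # about 0.06
print(padj_filtered)    # about 0.02
```

With an FDR cutoff of 0.05, this hypothetical gene would be called significant only in the filtered analysis.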
It is notable. For example, the GENCODE annotation for mouse has about 55,000 genes. In my data that usually gets filtered down to roughly 20,000 genes in "normal-sized" experiments with a few groups and 3-6 replicates per group. I am not sure whether the filtering notably affects the dispersion estimation and size factor calculations; I would need to check the manual and paper again. From what I understand, though, filtering should be the default.
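As a rough sketch of how the filtering step fits into an edgeR analysis with 3 control and 4 treated samples, something like the following could be used. The count matrix is simulated and the gene numbers are invented for illustration, so substitute your own counts. The quasi-likelihood pipeline (glmQLFit / glmQLFTest) is one common choice and is an assumption here, not something stated in the question.

```r
library(edgeR)

set.seed(1)
# Design matching the question: 3 control and 4 treated samples
group  <- factor(c(rep("control", 3), rep("treated", 4)),
                 levels = c("control", "treated"))
design <- model.matrix(~ group)

# Simulated counts (made up): 1000 well-expressed and 4000 near-zero genes
n_expr <- 1000
n_low  <- 4000
counts <- rbind(
  matrix(rnbinom(n_expr * 7, mu = 200, size = 10), ncol = 7),
  matrix(rnbinom(n_low  * 7, mu = 0.5, size = 10), ncol = 7)
)
rownames(counts) <- paste0("gene", seq_len(nrow(counts)))

y <- DGEList(counts = counts, group = group)

# Keep genes with enough counts in a worthwhile number of samples
keep <- filterByExpr(y, design)
table(keep)

# Subset and recompute library sizes from the retained genes
y <- y[keep, , keep.lib.sizes = FALSE]

# Normalization, dispersion estimation and testing on the filtered genes
y   <- calcNormFactors(y)
y   <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
res <- glmQLFTest(fit, coef = 2)   # coef 2 = treated vs control
topTags(res)
```

Comparing `table(keep)` with the number of rows in your original count matrix shows how many tests the filtering removes before the p-values are adjusted.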
Thank you, sorry for the late reply. I'll have a read of the manual and see if I can find anything.