EdgeR filter thresholds
8.0 years ago
Biogeek ▴ 470

Guys and girls,

When it comes to edgeR CPM filtering, how does one define a suitable cut-off? I have read that this filter choice is arbitrary. The edgeR vignette suggests keeping genes with cpm(y) > 1 in at least n samples, n being the smallest number of samples in a replicate group.
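To make the rule concrete, here is a minimal sketch of the arithmetic behind a "CPM above a cut-off in at least n samples" filter. It is plain Python on made-up counts, not edgeR code; the function names `cpm` and `keep_genes` and the toy table are assumptions for illustration only. In edgeR itself the same rule is commonly written as `keep <- rowSums(cpm(y) > 1) >= n`.

```python
def cpm(counts):
    """Convert a genes x samples count table to counts per million (CPM)."""
    n_samples = len(counts[0])
    # Library size = total count in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)]
            for row in counts]


def keep_genes(counts, cutoff, min_samples):
    """True for each gene whose CPM exceeds `cutoff` in at least `min_samples` samples."""
    return [sum(v > cutoff for v in row) >= min_samples for row in cpm(counts)]


# Made-up counts: 4 genes x 4 samples, each library sums to 1,000,000 reads,
# so here one read equals one CPM.
counts = [
    [500000, 400000, 600000, 500000],  # gene A: highly expressed everywhere
    [0, 0, 0, 2],                      # gene B: CPM > 1 in one sample only
    [5, 3, 4, 0],                      # gene C: CPM > 1 in three samples
    [499995, 599997, 399996, 499998],  # gene D: highly expressed everywhere
]

keep = keep_genes(counts, cutoff=1, min_samples=2)
print(keep)
```

Raising either the cut-off or the required number of samples makes the filter stricter; neither value is dictated by the method, which is why the choice is often described as arbitrary.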

I have an experiment with two time points and I'm running the GLM (drug placebo example in the vignette) with a design file and a GLM fit.

3 conditions: control, light and heavy, all 3 conditions on each of the 2 time points, 3 biological reps per group.

I was planning on using cpm(y) > 10 in at least 6 samples, as I expect genes to appear in at least 6 of the 18 samples. I know edgeR recommends cpm(y) > 10 in at least 3 samples in my case (3 being the smallest replicate group). Is there anything wrong with using 6, as each treatment spans 2 time points? My BCV comes down when I filter, as there seem to be a lot of lowly expressed transcripts in my samples. I am more interested in the tag comparisons that show moderate to large differential expression.
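The two candidate values of n correspond to two ways of grouping the same 18 samples. The sketch below just counts them; the time point labels `t1` and `t2` are hypothetical placeholders, since the post does not name them.

```python
from collections import Counter
from itertools import product

treatments = ["control", "light", "heavy"]
timepoints = ["t1", "t2"]  # hypothetical labels, not given in the post
reps = 3                   # biological replicates per treatment x time group

# One (treatment, timepoint) tuple per sample.
samples = [(trt, tp)
           for trt, tp in product(treatments, timepoints)
           for _ in range(reps)]

by_group = Counter(samples)                        # treatment x time groups
by_treatment = Counter(trt for trt, _ in samples)  # pooled over time points

smallest_group = min(by_group.values())
smallest_treatment = min(by_treatment.values())

print(len(samples), "samples in", len(by_group), "groups")
print("smallest treatment x time group:", smallest_group)
print("smallest treatment, pooled over time:", smallest_treatment)
```

With n = 6, a gene that is expressed in only one treatment-by-time group (3 samples) cannot pass the filter, so any such time-specific response would be removed before testing.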

Thanks for the feedback. I have heard of the genefilter package for R; does anyone know of any tutorials on how to apply it with edgeR rather than DESeq2?

edger DIFFERENTIAL EXPRESSION • 4.0k views

There is nothing wrong with using n=6, as long as you are OK with removing transcripts expressed below that threshold. But is it cpm(y) > 10 or cpm(y) > 1?


Is cpm(y) > 2 in at least 9 samples still OK? It brings the BCV down but retains more transcripts.
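A toy comparison of the three settings mentioned in this thread shows that they do not simply keep more or fewer genes; they keep different genes. The CPM values and gene names below are invented for illustration, and `kept` is a hypothetical helper, not an edgeR function.

```python
def kept(cpm_table, cutoff, min_samples):
    """Names of genes with CPM > cutoff in at least min_samples samples."""
    return [gene for gene, values in cpm_table.items()
            if sum(v > cutoff for v in values) >= min_samples]


# Invented CPM profiles across 18 samples.
cpm_table = {
    "g1": [15.0] * 3 + [0.0] * 15,  # on in a single 3-replicate group
    "g2": [3.0] * 18,               # low but present in every sample
    "g3": [50.0] * 18,              # well expressed everywhere
    "g4": [12.0] * 6 + [0.5] * 12,  # on in one treatment at both time points
    "g5": [0.5] * 18,               # near-background everywhere
}

settings = [(10, 3), (10, 6), (2, 9)]  # (CPM cut-off, minimum samples)
results = {s: kept(cpm_table, *s) for s in settings}

for (cutoff, n), genes in results.items():
    print(f"CPM > {cutoff} in >= {n} samples keeps {genes}")
```

In this toy table the lower cut-off with a higher sample requirement keeps the broadly but weakly expressed gene and drops the genes that are switched on in only a subset of samples, so the setting should match the expression patterns you care about.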
