edgeR: apply minimum cpm threshold only to certain samples
2.9 years ago
826afca9 • 0

Hello!

I have a set of RNA-seq data that consists of my samples (uninduced/induced cells) as well as controls (parental cell lines).

I would like to apply a minimum cpm threshold, but only to my samples (3 replicates each), and not to the controls. This is because there is a small subset of genes expressed only in my samples that are currently being removed by the minimum count threshold, since they are not expressed in the controls.

Is there any way of doing this with filterByExpr or the following?

keep <- rowSums(cpm(y) > x) >= z

Or should I just separate the data, apply different minimum count thresholds, and then merge it again?

Thanks!

edgeR • 907 views
2.9 years ago
Gordon Smyth ★ 8.3k

If you are using edgeR for differential expression, then it is an incorrect procedure to apply a filtering threshold only to certain samples. The filtering needs to be independent of differential expression, and that can only be true if the treatment assignment is not used in the filtering.

I'm not entirely sure what the problem is here, because filterByExpr does not require genes to be expressed in the controls. Genes will be kept even if they are expressed only in the induced cells or only in the uninduced cells.
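As a minimal sketch of the point above, the following uses simulated counts (made-up data, not the poster's) with three groups of three replicates. The gene placed in row 1 has reads only in the induced samples, and row 2 has no reads at all. The group labels, gene positions and count values are assumptions chosen for illustration.

```r
library(edgeR)

set.seed(1)

# Three groups with three replicates each (illustrative labels)
group <- factor(rep(c("control", "uninduced", "induced"), each = 3))

# Simulated counts: 1000 genes x 9 samples, library sizes around 100,000
counts <- matrix(rnbinom(1000 * 9, mu = 100, size = 10), nrow = 1000)
colnames(counts) <- paste0(group, rep(1:3, 3))

# Row 1: expressed only in the three induced samples
counts[1, ] <- c(0, 0, 0,  0, 0, 0,  50, 60, 55)
# Row 2: not expressed anywhere
counts[2, ] <- 0

y <- DGEList(counts = counts, group = group)

# filterByExpr takes the grouping from y$samples$group; it requires enough
# counts in as many samples as the smallest group has, in any samples
keep <- filterByExpr(y)

keep[1:2]
```

With these simulated values, row 1 passes the filter although it is zero in the controls and the uninduced cells, while row 2 is removed.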

You could simply run filterByExpr with smaller values for min.count and min.total.count. Would that solve the problem?
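A hedged sketch of what relaxing the thresholds might look like, again on simulated data. In edgeR's filterByExpr signature the total-count argument is spelled `min.total.count`; the defaults are `min.count = 10` and `min.total.count = 15`. The values 3 and 6 below are arbitrary assumptions for illustration, not recommendations.

```r
library(edgeR)

set.seed(2)

group <- factor(rep(c("control", "uninduced", "induced"), each = 3))

# Simulated counts: 500 genes x 9 samples, library sizes around 50,000
counts <- matrix(rnbinom(500 * 9, mu = 100, size = 10), nrow = 500)

# Row 1: a weakly expressed gene seen only in the induced samples
counts[1, ] <- c(0, 0, 0,  0, 0, 0,  4, 5, 6)

y <- DGEList(counts = counts, group = group)

# Default thresholds: min.count = 10, min.total.count = 15
keep_default <- filterByExpr(y)

# Relaxed thresholds (illustrative values only)
keep_relaxed <- filterByExpr(y, min.count = 3, min.total.count = 6)

# The same rule is applied to every sample, so the filter stays
# independent of the treatment assignment
y_filtered <- y[keep_relaxed, , keep.lib.sizes = FALSE]

c(default = keep_default[1], relaxed = keep_relaxed[1])
```

In this simulation the weakly expressed gene is dropped under the defaults and retained under the relaxed settings.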
