5.7 years ago
Assa Yeroslaviz
I know I can filter my counts matrix using this command
filtered.counts <- counts[rowSums(counts==0)<3, ]
which keeps only genes that have zero counts in fewer than three samples.
But is there a way to do the same, but remove rows from the matrix only when these three zeros all fall within one condition?
I have two conditions with four replicates each. I would like to keep genes with counts in at least two of them.
Would this kind of filtering make sense? Or do I create a bias in the expression matrix?
Thanks, Assa
You could simply use something like filterByExpr from edgeR. I would keep the rows (genes) where one condition has all zeros while the others have non-zero values. Depending on the sequencing depth across different samples/conditions, such a gene might simply be under-/over-represented in one condition vs. the others. And yes, sample-specific filtering might result in biases in the downstream steps.
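For illustration, here is a minimal base R sketch of a per-condition filter along these lines. It assumes a samples-in-columns counts matrix and reads the question as "keep a gene if at least two replicates of at least one condition have a non-zero count". The toy data, the `group` factor, and the `min_rep` threshold are made up for the example and are not from the original post. The edgeR call at the end is optional and only runs if the package is installed.

```r
# Toy counts matrix: 5 genes x 8 samples (hypothetical data for illustration).
# Samples s1-s4 belong to condition A, s5-s8 to condition B.
counts <- matrix(
  c(5, 8, 3, 9,  0, 0, 0, 0,   # geneA: expressed only in condition A
    0, 0, 0, 1,  0, 0, 0, 2,   # geneB: one non-zero replicate per condition
    4, 0, 6, 0,  7, 2, 0, 0,   # geneC: two non-zero replicates per condition
    0, 0, 0, 0,  0, 0, 0, 0,   # geneD: all zeros
    0, 0, 0, 0,  3, 0, 5, 1),  # geneE: expressed only in condition B
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:5]), paste0("s", 1:8))
)

# Assumed design: two conditions with four replicates each.
group <- factor(rep(c("A", "B"), each = 4))

# Assumed threshold: minimum number of non-zero replicates within a condition.
min_rep <- 2

# For each condition, count the replicates with a non-zero count per gene.
nonzero <- counts > 0
expressed_per_group <- sapply(levels(group), function(g) {
  rowSums(nonzero[, group == g, drop = FALSE])
})

# Keep a gene if at least one condition reaches the threshold.
keep <- rowSums(expressed_per_group >= min_rep) >= 1
filtered.counts <- counts[keep, , drop = FALSE]

# Optional: edgeR's filterByExpr uses the group information to choose a
# minimum sample size and filters on CPM rather than on raw zeros.
if (requireNamespace("edgeR", quietly = TRUE)) {
  keep_edger <- edgeR::filterByExpr(counts, group = group)
}
```

With this rule, a gene that is entirely absent from one condition but detected in the other (geneA, geneE) is retained, which matches the suggestion above. Because the rule is applied identically to every condition rather than to individual samples, it is less likely to favour one condition over the other.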