Why remove lowly expressed genes in RNA-seq?
5.9 years ago
Ankit ▴ 500

Hi all,

I have a question about filtering lowly expressed genes in RNA-seq data analysis. What I understood from the literature is that genes with low counts are mostly noise and do not give a true picture of differential expression, so they should be removed if very low counts are observed in all samples (not just in one sample).
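As a rough illustration of the rule described above (drop a gene only when its counts are very low in every sample), here is a minimal Python sketch. The gene names, the counts, and the threshold of 10 are all made up for illustration and are not a recommended cutoff.

```python
# Hypothetical raw count matrix: gene name -> counts in each of 4 samples.
counts = {
    "geneA": [523, 610, 498, 575],  # well expressed everywhere
    "geneB": [0, 1, 0, 2],          # very low in all samples
    "geneC": [3, 0, 145, 160],      # low in some samples, high in others
    "geneD": [0, 0, 0, 0],          # never detected
}

min_count = 10  # arbitrary threshold, chosen only for this example

# Keep a gene if at least one sample reaches the threshold, i.e. remove it
# only when ALL samples are below the threshold.
kept = {gene: c for gene, c in counts.items() if max(c) >= min_count}

print(sorted(kept))  # ['geneA', 'geneC']
```

Note that geneC survives even though two of its samples are near zero, because the filter only removes genes that are low across the board.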

I wonder if there is any other basis for it, mainly in terms of:

1. Differential expression

2. Statistics

3. Any molecular / biological reason

I would appreciate any suggestions.

Thanks

Ankit

RNA-Seq • 4.8k views

StatQuest gives some nice details on this.

5.9 years ago

The main reason for discarding low-count genes is to avoid testing genes for which we presume a difference in expression would not be relevant. If you perform fewer tests, the correction for multiple testing becomes less stringent, and the overall power of your experiment increases. This concept, called 'independent filtering', is well explained in this publication: Independent filtering increases detection power for high-throughput experiments
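To make the multiple-testing argument concrete, here is a small Python sketch, assuming Benjamini-Hochberg (BH) is the correction being used. The p-values, mean counts, and the mean-count cutoff of 10 are invented for illustration. The filter uses the overall mean count across all samples, which does not look at group labels; that matters because the filter is only valid if it is independent of the test statistic under the null hypothesis.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene results: overall mean count and raw p-value.
mean_counts = [850, 420, 310, 95, 60, 2, 1, 0.5, 3, 1.5]
pvalues = [0.001, 0.008, 0.012, 0.022, 0.030, 0.40, 0.75, 0.90, 0.55, 0.62]

alpha = 0.05
min_mean_count = 10  # arbitrary cutoff for this example

# Correct over all 10 genes.
n_before = sum(p < alpha for p in bh_adjust(pvalues))

# Drop low-count genes first, then correct over the remaining 5 genes.
kept = [p for p, c in zip(pvalues, mean_counts) if c >= min_mean_count]
n_after = sum(p < alpha for p in bh_adjust(kept))

print(n_before, n_after)  # prints: 3 5
```

The raw p-values of the five well-expressed genes are identical in both runs. Only the number of tests changes, and with fewer tests two more genes pass the 5% FDR threshold.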


Thanks Carlo for the suggestion.

I have one more question after reading this.

How are lowly expressed genes (or zero counts) related to the type I error and the false discovery rate? Is there a mathematical relationship?

Please suggest.

Thanks
