Question: Cutoff of reads in htseq
5.6 years ago
European Union
olga.lubinsky0 wrote:


To my understanding, the standard cutoff for falsely expressed genes in a transcriptome is 30 reads.

Am I right? And what is the reason for this specific cutoff?

Thank you


rna-seq
Can you explain more? Going by the heading of the question: htseq has no minimum or maximum cutoff when counting the number of reads/fragments mapped per genomic location. If you are talking about differential expression, the cutoff will be based on a statistical test, not on any standard number.

If you can provide any source for your understanding, that would be good.

5.6 years ago
Freiburg, Germany
Devon Ryan96k wrote:

There is no standard cut-off and there could never be one. The best practice is to use independent filtering.

5.6 years ago
Seville, ES
Martombo2.6k wrote:
There is no standard threshold. You can use whichever one is most suitable for your data. In DESeq2 you will find an implementation of the independent filtering suggested by Devon. It basically sets the threshold so as to get the highest number of differentially expressed calls. Check it afterwards, since it could be too high (or too low), depending on your expectations.
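To make the idea of independent filtering concrete, here is a minimal Python sketch of the principle: try several mean-count thresholds, re-adjust the p-values of the genes that survive each one, and keep the threshold that gives the most calls. The function names (`bh_adjust`, `independent_filter`), the quantile grid and the `alpha` default are illustrative assumptions, not DESeq2's API, and DESeq2's own implementation differs in detail.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(ranked, 1.0)
    return adjusted

def independent_filter(mean_counts, pvals, alpha=0.1, quantiles=None):
    """Pick the mean-count cutoff that maximises the number of
    BH-adjusted p-values <= alpha (hypothetical helper)."""
    if quantiles is None:
        quantiles = np.linspace(0, 0.95, 20)  # assumed grid of thresholds
    mean_counts = np.asarray(mean_counts, dtype=float)
    pvals = np.asarray(pvals, dtype=float)
    best_cutoff, best_rejections = 0.0, -1
    for q in quantiles:
        cutoff = float(np.quantile(mean_counts, q))
        keep = mean_counts >= cutoff
        if not keep.any():
            continue
        # Adjust only over the genes that pass the filter.
        rejections = int((bh_adjust(pvals[keep]) <= alpha).sum())
        if rejections > best_rejections:
            best_cutoff, best_rejections = cutoff, rejections
    return best_cutoff, best_rejections
```

The filter statistic (mean count) has to be independent of the test statistic under the null, which is why the mean across all samples is used rather than anything that looks at the group labels.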
I'm not dealing with differentially expressed genes but with the transcriptome itself.

I have the read counts for each gene. How do I decide which number of reads is too low for a gene to be considered significantly expressed?

There's no standard for that either (there never could be). The best you can do is plot the FPKM/RPKMs and see if you have an obviously bimodal distribution. If not, you're left with using the top X% as expressed.
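A rough Python sketch of both options, in case it helps: one helper looks for the valley between the two tallest peaks in a histogram of log2 expression, the other just returns the value that marks off the top X%. The names `valley_cutoff` and `top_fraction_cutoff`, the bin count and the pseudocount are all assumptions for illustration. Noisy histograms can produce spurious peaks, so this is no substitute for actually looking at the plot.

```python
import numpy as np

def top_fraction_cutoff(expr, fraction=0.5):
    """Expression value above which the top `fraction` of genes lie."""
    expr = np.asarray(expr, dtype=float)
    return float(np.quantile(expr, 1.0 - fraction))

def valley_cutoff(expr, bins=50, pseudocount=1.0):
    """Cutoff (in original units) at the emptiest histogram bin between
    the two tallest peaks of log2(expr + pseudocount).
    Returns None if fewer than two peaks are found."""
    logged = np.log2(np.asarray(expr, dtype=float) + pseudocount)
    counts, edges = np.histogram(logged, bins=bins)
    # Pad with zeros so the first and last bins can count as peaks.
    padded = np.concatenate(([0], counts, [0]))
    is_peak = (padded[1:-1] > padded[:-2]) & (padded[1:-1] >= padded[2:])
    peaks = np.flatnonzero(is_peak)
    if peaks.size < 2:
        return None
    # Two tallest peaks, in left-to-right order.
    lo, hi = np.sort(peaks[np.argsort(counts[peaks])[-2:]])
    valley = lo + int(np.argmin(counts[lo:hi + 1]))
    centre = 0.5 * (edges[valley] + edges[valley + 1])
    return float(2.0 ** centre - pseudocount)
```

If `valley_cutoff` returns None, or the valley is not convincing on the plot, `top_fraction_cutoff` is the fallback; the choice of X stays a judgment call.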

Hello Devon, can I ask: after observing the bimodal distribution, which steps do you suggest for filtering? I'm dealing with TPM. Thanks in advance.

