Hello,
I've analyzed my RNA-seq data using DESeq2, and now I am trying to see whether some genes of interest are expressed in our dataset. I found a paper that uses an RPKM cutoff of 0.3 to define genes as expressed. I found out that I can use the fpkm function in DESeq2 to convert the counts to FPKM/RPKM. However, from reading some posts here, it seems that rlog- or vst-normalized counts are recommended instead. Is there a common rule of thumb for defining expressed genes from rlog- or vst-normalized counts? Thanks in advance!
rlog and vst are recommended when you need to compare gene expression between conditions. To see whether a gene is expressed within a single condition, RPKM/FPKM is fine IMHO.
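To make the arithmetic behind such a cutoff concrete, here is a minimal sketch in plain Python. It assumes a simple total-count FPKM (count × 1e9 / (gene length × total counts)), which is not necessarily what DESeq2's fpkm function returns, since that function may normalize library size differently. The gene names, counts, lengths, and the `>=` comparison are made up for illustration; only the 0.3 cutoff comes from the question.

```python
def fpkm(counts, lengths_bp):
    """Plain FPKM for one sample.

    counts: dict of gene -> raw fragment count (hypothetical data)
    lengths_bp: dict of gene -> gene length in base pairs
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    # fragments per kilobase of gene per million mapped fragments
    return {g: c * 1e9 / (lengths_bp[g] * total) for g, c in counts.items()}


def expressed_genes(fpkm_values, cutoff=0.3):
    """Return the genes whose FPKM is at or above the cutoff."""
    return sorted(g for g, v in fpkm_values.items() if v >= cutoff)


# Toy sample with one million fragments in total (invented numbers).
counts = {"geneA": 500, "geneB": 0, "geneC": 1, "geneD": 999499}
lengths = {"geneA": 2000, "geneB": 1500, "geneC": 5000, "geneD": 1000}

values = fpkm(counts, lengths)
print(expressed_genes(values))
```

With these toy numbers geneC lands at an FPKM of 0.2 and falls below the cutoff, while geneA and geneD pass. In practice you would probably require the cutoff to be met in some minimum number of samples per condition; that choice is up to you.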
Why don't you convert them to TPM and use a global threshold? While reading this blog post, I came across this: "When you use TPM, the sum of all TPMs in each sample are the same. This makes it easier to compare the proportion of reads that mapped to a gene in each sample."
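A rough sketch of that conversion, assuming you already have per-sample FPKM values: each gene's TPM is its FPKM divided by the sample's FPKM total, times one million. The sample dictionaries below are invented for illustration.

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm_values.values())
    if total == 0:
        raise ValueError("all FPKM values are zero")
    return {g: v / total * 1e6 for g, v in fpkm_values.items()}


# Two hypothetical samples with very different FPKM totals.
sample1 = {"geneA": 250.0, "geneB": 0.0, "geneC": 0.2}
sample2 = {"geneA": 10.0, "geneB": 30.0, "geneC": 60.0}

tpm1 = fpkm_to_tpm(sample1)
tpm2 = fpkm_to_tpm(sample2)

# Both totals come out at one million (up to floating-point rounding).
print(sum(tpm1.values()), sum(tpm2.values()))
```

Because every sample ends up with the same total, a single threshold means the same proportion of the transcript pool in each sample, which is the point of the quoted sentence.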
Previous discussion about defining expression cutoffs: TPM values of expressed genes