2.4 years ago by
Vienna - BOKU
So you have calculated RPKM for each gene in each sample. First, I would suggest using a better normalization, such as TPM ( http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/ ), because TPM values sum to the same total in every sample and are therefore easier to compare across samples.
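If you only have the RPKM table, you can rescale it to TPM without going back to the counts: divide each value by the sum of its sample's RPKMs and multiply by one million. A minimal sketch, assuming a genes x samples NumPy array; the function name `rpkm_to_tpm` and the toy numbers are made up for illustration:

```python
import numpy as np

def rpkm_to_tpm(rpkm):
    """Rescale an RPKM matrix (rows = genes, columns = samples) to TPM.

    TPM_i = RPKM_i / sum_j(RPKM_j) * 1e6, computed per sample (column).
    Assumes every sample has a non-zero RPKM total.
    """
    rpkm = np.asarray(rpkm, dtype=float)
    totals = rpkm.sum(axis=0, keepdims=True)  # one total per sample
    return rpkm / totals * 1e6

# Toy example: 3 genes, 2 samples (invented values)
rpkm = np.array([[10.0, 0.0],
                 [30.0, 5.0],
                 [60.0, 15.0]])
tpm = rpkm_to_tpm(rpkm)
print(tpm.sum(axis=0))  # every column now sums to one million
```

After the conversion every sample has the same total, which is what makes a single cutoff comparable between samples.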
The right cutoff always depends on your data. Plotting the density of the TPMs for each sample and comparing the samples shows you where the bulk of the distribution (and its mean) lies, and therefore how many genes you lose at a given cutoff. For example, with FPKM a common practice was to keep only genes with FPKM >= 1. As far as I know a cutoff of 1 also works for TPM, but check the plot first, because the number is essentially arbitrary.
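Here is a rough sketch of that check, assuming the TPMs are in a genes x samples NumPy array. The helper names (`fraction_kept`, `log_density`, `plot_densities`) and the simulated data are only for illustration, and the density is a simple histogram estimate on log2(TPM + 1), not a kernel density:

```python
import numpy as np

def fraction_kept(tpm, cutoff=1.0):
    """Per-sample fraction of genes with TPM >= cutoff."""
    tpm = np.asarray(tpm, dtype=float)
    return (tpm >= cutoff).mean(axis=0)

def log_density(values, bins=50):
    """Histogram density of log2(TPM + 1) for one sample."""
    logged = np.log2(np.asarray(values, dtype=float) + 1.0)
    density, edges = np.histogram(logged, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers, density

def plot_densities(tpm, sample_names, cutoff=1.0):
    """Overlay the per-sample densities and mark the cutoff (needs matplotlib)."""
    import matplotlib.pyplot as plt
    tpm = np.asarray(tpm, dtype=float)
    for j, name in enumerate(sample_names):
        centers, density = log_density(tpm[:, j])
        plt.plot(centers, density, label=name)
    plt.axvline(np.log2(cutoff + 1.0), linestyle="--", color="grey")
    plt.xlabel("log2(TPM + 1)")
    plt.ylabel("density")
    plt.legend()
    plt.show()

# Simulated toy values standing in for real TPMs (1000 genes, 2 samples)
rng = np.random.default_rng(0)
tpm = rng.lognormal(mean=0.0, sigma=2.0, size=(1000, 2))
for cutoff in (0.1, 0.5, 1.0, 5.0):
    print(cutoff, fraction_kept(tpm, cutoff))
```

Calling `plot_densities(tpm, ["sample1", "sample2"])` draws the curves with the cutoff as a dashed line. If the cutoff sits in the low-expression shoulder of every sample it is probably reasonable; if it cuts through the main peak you are throwing away a lot of expressed genes.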