I have an RNA-seq dataset. So far I have prepared the data and calculated the raw read count for each gene in each sample, so I have a matrix in which the columns are samples and the rows are genes. Now I want to filter out some of the genes to reduce the false positive rate. Could you please let me know how I can do the filtering?
I have tried counts per million (CPM), which I calculated for every gene in every sample, but I don't know how to determine the best cutoff value. For example, can I say that a gene must be removed if its value is 2 or less in at least 10 samples?
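
To make the question concrete, here is a minimal sketch of the kind of filter I have in mind, written with NumPy. The function names `cpm` and `low_expression_mask`, the toy matrix, and the choice to apply the rule to CPM values rather than raw counts are all assumptions for illustration. The cutoff of 2 and the 10 samples are just the numbers from my example, not values I know to be appropriate.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) by its library size."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)  # total reads per sample
    return counts / lib_sizes * 1e6

def low_expression_mask(counts, cutoff=2.0, n_samples=10):
    """True for genes whose CPM is <= cutoff in at least n_samples samples."""
    low = cpm(counts) <= cutoff
    return low.sum(axis=1) >= n_samples

# Toy matrix with made-up numbers: 4 genes (rows) x 12 samples (columns).
counts = np.array([
    [0] * 12,                                  # never detected
    [500] * 12,                                # clearly expressed
    [1, 0, 2, 1, 0, 1, 0, 0, 1, 2, 300, 400],  # low in 10 of 12 samples
    [1_000_000] * 12,                          # dominates the library size
])

remove = low_expression_mask(counts, cutoff=2.0, n_samples=10)
filtered = counts[~remove]  # keep only genes that pass the filter
```

With this rule the first and third genes would be dropped, including the one that is strongly expressed in two samples, which is part of why I am unsure whether the rule is sensible.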