I'm working on an RNA-seq analysis of case and control samples and have identified some differentially expressed coding genes and long non-coding RNAs (lncRNAs) with edgeR. I would like to do an integrative lncRNA-mRNA analysis. The library size of the cases is small (about 2 million raw counts) compared to the controls (about 60 million raw counts), so I filtered out genes with a CPM value below 5 during the edgeR analysis. Given that lncRNAs usually have low expression, I'm concerned that this CPM threshold may cause some lncRNAs to be missed. Could you please share your thoughts on the analysis?
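To make the concern concrete, here is a rough sketch of what a CPM cutoff of 5 means in raw reads for the two library sizes. It is plain Python arithmetic, not edgeR code. The function name `cpm_to_count` is only for illustration, and the library sizes are the approximate values mentioned above.

```python
def cpm_to_count(cpm, library_size):
    """Raw read count that corresponds to a given CPM in a library of this size."""
    # CPM = count / library_size * 1e6, so count = CPM * library_size / 1e6
    return cpm * library_size / 1_000_000


cpm_cutoff = 5
# Approximate library sizes (assumed round numbers)
library_sizes = {"case": 2_000_000, "control": 60_000_000}

min_counts = {
    group: cpm_to_count(cpm_cutoff, size)
    for group, size in library_sizes.items()
}

for group, count in min_counts.items():
    print(f"{group}: CPM {cpm_cutoff} corresponds to about {count:.0f} reads")
```

So with this cutoff a gene needs roughly 10 reads in a case library but roughly 300 reads in a control library to reach CPM 5.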
Thanks a lot