I ran differential gene expression (DGE) analysis only on protein-coding genes, since those were what I was interested in. I am now considering looking at lncRNAs as well. Should I run DGE on the lncRNAs alone, or combine protein-coding genes and lncRNAs and rerun the analysis on both together?
The number of genes analysed differs substantially between these two approaches, which would also affect the multiple testing correction.
Can you provide more details? You need to generate a count matrix for the entire gene set and perform the DGE analysis on that. You can then shortlist the genes of interest (protein-coding or non-coding) afterwards.
My choice of whether to run DGE on protein-coding genes and lncRNAs separately or together will change the total number of features analysed in my dataset, and this will in turn change the stringency of the multiple testing correction. Does this matter? The DEG list based on p-values may be the same in both cases, but a DEG list filtered on q-values may be different.
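To make the concern concrete, here is a minimal sketch of Benjamini-Hochberg adjustment applied to the same protein-coding p-values twice: once on their own, and once pooled with lncRNA p-values. All p-values are made up for illustration, and `bh_adjust` is a hypothetical helper written for this example, not a function from any DGE package. The sketch also assumes the raw p-values are identical in both runs, which may not hold exactly in practice.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values (illustrative only, not real data).
coding = [0.001, 0.008, 0.02, 0.04, 0.3, 0.6]
lncrna = [0.03, 0.2, 0.5, 0.7, 0.8, 0.9]

# Protein-coding genes corrected on their own.
q_separate = bh_adjust(coding)
# Protein-coding genes corrected together with the lncRNAs.
q_combined = bh_adjust(coding + lncrna)[:len(coding)]

cutoff = 0.05
n_separate = sum(q < cutoff for q in q_separate)
n_combined = sum(q < cutoff for q in q_combined)
print("coding DEGs, corrected alone:", n_separate)
print("coding DEGs, corrected with lncRNAs:", n_combined)
```

With these particular numbers the pooled correction is stricter for the protein-coding genes, because the added lncRNA p-values are mostly large. That direction is not guaranteed: if the added features carried many small p-values, the pooled q-values could come out lower instead.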