Question: Filtering of lncRNAs and mRNAs for Differential expression analysis from RNA-seq data
I want to perform differential expression (DE) analysis of lncRNAs and mRNAs from RNA-seq data using DESeq2. Is it a good idea to have a separate annotation file for each and then perform the DE analysis, or is it OK to perform the DE analysis using a single annotation file that contains both lncRNAs and protein-coding mRNAs and then filter by gene biotype?

My preference is to keep them separate and to conduct the analysis separately. Protein coding mRNAs are typically expressed at higher levels than ncRNAs, with many ncRNAs being expressed at very low values (if at all, in your tissue of interest). For that reason, I feel that normalising them together would in fact bias the results, but I don't have data to back up this hypothesis.

I would be interested to hear other ideas.

It would be a good idea to process them separately, from alignment through to DE. The following research article suggests a similar approach.

Thanks, all, for your responses. I agree that analysing lncRNAs and mRNAs separately is a better approach than processing them all together.

The only reason to process them separately is if the lncRNAs are so lowly expressed that they keep getting excluded by independent filtering. If that does happen to be the case, then perform everything up to and including computation of the size factors with the lncRNAs mixed with the other genes. This (1) ensures that the size factors remain constant and (2) ensures robustness. Thereafter you can split and test separately.
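
To make that order of operations concrete, here is a minimal sketch in plain Python rather than DESeq2 itself. It assumes a median-of-ratios style size factor (the idea behind DESeq2's `estimateSizeFactors`), and the gene names, biotype labels and counts are all made up for illustration. The point is only that the size factors are computed once on the combined matrix and then reused after the split.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts maps gene id -> list of raw counts, one per sample.
    Simplified illustration, not DESeq2's actual implementation.
    """
    n_samples = len(next(iter(counts.values())))
    # Only genes with non-zero counts in every sample have a defined
    # log geometric mean, so only those contribute to the size factors.
    log_geo_means = {
        gene: sum(math.log(c) for c in row) / n_samples
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(counts[g][j]) - m for g, m in log_geo_means.items()]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Hypothetical toy data: gene id -> (biotype, raw counts in 4 samples).
toy = {
    "mRNA_A": ("protein_coding", [1000, 2000, 1100, 2100]),
    "mRNA_B": ("protein_coding", [500, 1000, 480, 1050]),
    "mRNA_C": ("protein_coding", [250, 500, 260, 490]),
    "lnc_X": ("lncRNA", [4, 9, 20, 41]),
    "lnc_Y": ("lncRNA", [0, 3, 2, 5]),
}

# Step 1: size factors from ALL genes together (lncRNAs mixed with mRNAs).
all_counts = {gene: row for gene, (_, row) in toy.items()}
sf = size_factors(all_counts)

# Step 2: normalise every gene with those shared size factors.
normalised = {
    gene: [c / s for c, s in zip(row, sf)] for gene, row in all_counts.items()
}

# Step 3: only now split by biotype; each subset would be tested separately.
lnc = {g: normalised[g] for g, (biotype, _) in toy.items() if biotype == "lncRNA"}
mrna = {g: normalised[g] for g, (biotype, _) in toy.items() if biotype == "protein_coding"}
```

Because the split happens after step 1, both subsets are scaled by the same per-sample factors, so the lncRNA results are not driven by a normalisation estimated from a handful of low-count genes.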

This post will give you some ideas:

A: Any One please provide protocol for Analysing long noncoding RNA illumina NGS da

I would recommend performing the DE analysis separately, since the abundance of protein-coding transcripts might introduce bias.

