5.3 years ago
rrbutleriii
I have seen in some transcriptome comparison analyses that the total RNA-seq data set is pre-filtered to include only protein-coding genes, or to exclude pseudogenes or "short RNAs" (Schwartzentruber et al.). I take this to imply that the transcriptome is being treated as the set of mRNA transcripts only.
My issue is that I haven't been able to find any RNA-seq tutorials that recommend this, and neither the edgeR nor the DESeq2 vignette recommends it either. Is there some data on this that I am missing?
Would there be an argument for filtering on other characteristics as well (e.g. a minimum average transcript length, to exclude short RNAs)?
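To make the question concrete, here is a minimal sketch of the kind of pre-filter I mean, in plain Python. Everything in it is an assumption for illustration: the gene IDs, lengths, and counts are made up, the biotype labels only follow the style of Ensembl/GENCODE annotation, the 200 bp cutoff is arbitrary, and `prefilter` is a hypothetical helper, not something from edgeR, DESeq2, or the cited paper.

```python
# Hypothetical toy data: gene ID -> (biotype, transcript length in bp, counts per sample).
genes = {
    "geneA": ("protein_coding", 2400, [120, 98, 143]),
    "geneB": ("processed_pseudogene", 900, [4, 0, 7]),
    "geneC": ("miRNA", 80, [300, 280, 310]),
    "geneD": ("protein_coding", 1500, [0, 2, 1]),
}


def prefilter(genes, keep_biotypes=None, min_length=0):
    """Return IDs of genes passing an optional biotype filter and a length filter."""
    kept = []
    for gene_id, (biotype, length, _counts) in genes.items():
        # Drop genes whose biotype is not in the allowed set (if a set was given).
        if keep_biotypes is not None and biotype not in keep_biotypes:
            continue
        # Drop genes shorter than the (arbitrary) minimum length.
        if length < min_length:
            continue
        kept.append(gene_id)
    return kept


# Option 1: keep protein-coding genes only.
coding_only = prefilter(genes, keep_biotypes={"protein_coding"})

# Option 2: keep every biotype but drop short RNAs by length.
no_short = prefilter(genes, min_length=200)

print(coding_only)
print(no_short)
```

Note that neither option looks at the counts: this is an annotation-based filter applied before any expression-based filtering, which is exactly the step I can't find a recommendation for.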