I recently performed total RNA-seq using the Nugen SoLo library prep kit. My sample was a pool of sorted neuronal nuclei from mouse. Our informatics core aligned the sequencing data with STAR and found that about 20% of the reads are exonic and 60-70% are intronic. That is to be expected, I suppose, since this is a total RNA-seq prep and I'm using nuclear RNA as input. My question is: how do you handle the intronic reads?
One of my goals is differential gene expression. Would you combine the intronic reads with the exonic reads and collapse to the gene level, analyze the two separately, or disregard the intronic reads (which would be kind of a waste)? What do you think? Are there published analysis methods that deal with this kind of situation? I see lots of total RNA-seq prep kits on the market now, so I can't be the first to encounter this issue.
What sort of relative abundances are you seeing between introns and their flanking exons?
Not sure, but I will check. Good question.
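In case it helps with checking that, here is a rough sketch of one way to compare intron and exon coverage per gene. It assumes you already have per-feature read counts and feature lengths (for example from a counting tool run separately on exon and intron annotations). The `Feature` class, the function names, and the numbers are all made up for illustration; none of them come from the actual data or from any particular package.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    gene: str    # gene identifier
    kind: str    # "exon" or "intron"
    length: int  # feature length in bp
    reads: int   # reads assigned to this feature


def density_per_kb(features, kind):
    """Reads per kilobase for each gene, pooled over all features of one kind."""
    totals = {}
    for f in features:
        if f.kind != kind:
            continue
        reads, length = totals.get(f.gene, (0, 0))
        totals[f.gene] = (reads + f.reads, length + f.length)
    return {g: 1000.0 * r / l for g, (r, l) in totals.items() if l > 0}


def intron_exon_ratio(features):
    """Intronic read density divided by exonic read density, per gene."""
    exon = density_per_kb(features, "exon")
    intron = density_per_kb(features, "intron")
    return {g: intron[g] / exon[g]
            for g in exon
            if g in intron and exon[g] > 0}


# Hypothetical counts, for illustration only
features = [
    Feature("GeneA", "exon", 2000, 400),
    Feature("GeneA", "intron", 20000, 1000),
]
ratios = intron_exon_ratio(features)
print(ratios)
```

Normalizing by length matters here because introns are usually much longer than exons, so raw intronic read counts can look large even when per-base coverage is well below that of the flanking exons.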