Question: paired-end, single-end, polyA tail, 500bp, and removal of RNA genes
5 weeks ago, moushengxu wrote:

For DEG analysis using RNA-seq, we typically remove pseudogenes, microRNA genes, and other RNA genes such as lincRNAs, scaRNAs, and snoRNAs. The reasoning is that single-end RNA-seq typically uses the polyA tail to fish out the RNA to be sequenced, and these RNA genes are not polyadenylated, so they should not be there. This sounds fine to me.

My questions arise when paired-end RNA-sequencing is used:

  • If paired-end sequencing uses ~500 bp RNA segments, are the polyA tails always in these segments? If not, are the ~500 bp segments random chops of polyA RNA, or of any RNA?

  • Should we remove the RNA genes as we have done for single-end in DEG analysis?

  • I have genes like RPPH1, RMRP, RN45S, and MALAT1 high on my DEG list using paired-end alignment, but low on the DEG list using single-end alignment. These are RNA genes, but they do NOT belong to RNA gene classes such as snoRNA or lincRNA. Why is this so, and should I remove these RNA genes from the DEG analysis or not?

Thanks in advance!

rna-seq next-gen • 167 views

Note that many lncRNAs are polyadenylated.

written 5 weeks ago by Carlo Yague

Thanks for the education. However, maybe because their functions are usually unknown, lncRNAs are filtered out in our analysis pipeline. Not sure if this is a good idea.

written 5 weeks ago by moushengxu

5 weeks ago, Devon Ryan (Freiburg, Germany) wrote:
  1. The sequenced fragments are distributed somewhat randomly along the transcript, though there tends to be a bit of bias toward one end or the other.
  2. There's no real point in removing them in either case. You're going to get little if any signal from non-polyadenylated genes, so most of them will be removed at some filtering step anyway.
  3. The presence of things like RN45S suggests that something went amiss during poly-A enrichment. In fact, it sounds more like you did ribo-depletion, which doesn't work well unless it's done on fresh material.
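
A minimal sketch of the kind of low-count filtering point 2 alludes to, assuming a simple "at least `min_count` reads in at least `min_samples` samples" rule. The function name, the thresholds, and the count values are all hypothetical and chosen only for illustration; real pipelines typically use the filtering built into their DE package.

```python
def filter_low_counts(counts, min_count=10, min_samples=2):
    """Keep genes with at least `min_count` reads in at least `min_samples` samples.

    `counts` maps gene name -> list of raw read counts, one per sample.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    kept = {}
    for gene, per_sample in counts.items():
        n_pass = sum(1 for c in per_sample if c >= min_count)
        if n_pass >= min_samples:
            kept[gene] = per_sample
    return kept


# Made-up counts for four samples per gene (not real data).
counts = {
    "GAPDH": [5200, 4800, 5100, 4950],
    "SNORD3A": [0, 2, 1, 0],        # near-zero signal, as expected without a poly-A tail
    "MALAT1": [900, 40, 850, 35],
}

kept = filter_low_counts(counts)
print(sorted(kept))  # ['GAPDH', 'MALAT1']
```

With a filter like this, genes that yield essentially no reads drop out on their own, so removing whole gene classes by hand beforehand adds little.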

For point 1, are you referring to paired-end, single-end, or both?

For point 3, the treatment was an shRNA knockdown of an RNA-binding protein, which might affect RNA levels and cause the DE of the genes in question. My question is why paired-end alignment and single-end alignment gave such different results (in terms of DEG p-value). Does it have to do with the sequencing technology and the RSEM alignment algorithm?

written 5 weeks ago by moushengxu
  1. Both. The only difference between SE and PE sequencing is that in the latter you sequence both ends of the loaded fragments.
  2. Maybe mappability, but from your description I wonder whether you did ribo-depletion rather than poly-A selection. RSEM might affect things a bit, but I'd be surprised if the effect were that big.
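
One rough way to look into the ribo-depletion question is to check what share of each sample's reads lands on rRNA genes. The sketch below assumes per-gene read counts are already available as a dictionary; the function name, the gene set, and the numbers are hypothetical, and no particular cutoff is implied.

```python
def fraction_assigned(counts, gene_set):
    """Return the fraction of a sample's total reads assigned to genes in `gene_set`.

    `counts` maps gene name -> raw read count for one sample.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = sum(c for gene, c in counts.items() if gene in gene_set)
    return hits / total


# Assumption: extend this set with the rRNA genes in your own annotation.
RRNA_GENES = {"RN45S"}

# Made-up counts for a single sample (not real data).
sample = {"RN45S": 300000, "GAPDH": 50000, "ACTB": 150000}

frac = fraction_assigned(sample, RRNA_GENES)
print(f"{frac:.2f}")  # 0.60
```

A large rRNA share would be consistent with enrichment or depletion not having worked as intended, but it is only a hint and should be compared across samples and against the library prep records.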
written 5 weeks ago by Devon Ryan