Question: Reads with high duplication levels and biased GC content
4.3 years ago by
United Kingdom
mjg20 wrote:

Hi everyone,

I'm working with RNAseq data obtained with a total RNA protocol. The reads are paired-end and 75 bases long. There is a control group and a disease group.

The disease group shows a very consistent (and biased) pattern across samples in GC content and duplication levels, with some sequences present in >10,000 K copies (see images below).

Has anyone seen something similar and found a way to explain it?





modified 4.2 years ago by Biostar ♦♦ 20 • written 4.3 years ago by mjg20

If the bias is consistent within the group and doesn't correspond to a batch effect of some sort, then perhaps it's real. A high duplication rate is expected in any RNAseq experiment, and it's typically very high in total RNAseq experiments (even if you attempt ribo-depletion, you still end up sequencing >10% rRNA half the time).
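One way to look into this is to tally the most duplicated read sequences per sample and check their GC content, then see whether the top hits look like rRNA. The following is only a rough sketch of that idea: it counts exact-match duplicates in plain Python and is not a reimplementation of any QC tool. The function names (`read_fastq_sequences`, `gc_fraction`, `summarize`) are made up for illustration, and it assumes an uncompressed FASTQ with four lines per record.

```python
from collections import Counter


def read_fastq_sequences(lines):
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text.

    Assumes a well-formed, uncompressed FASTQ with 4 lines per record.
    """
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def gc_fraction(seq):
    """Fraction of G/C among called bases (A, C, G, T); 0.0 if none are called."""
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def summarize(seqs, top_n=5):
    """Summarize exact-match duplication and GC content for an iterable of reads."""
    counts = Counter(seqs)
    total = sum(counts.values())
    if total == 0:
        return {
            "total_reads": 0,
            "distinct_sequences": 0,
            "duplicate_fraction": 0.0,
            "mean_gc": 0.0,
            "top_sequences": [],
        }
    # Reads beyond the first copy of each distinct sequence count as duplicates.
    duplicated = total - len(counts)
    # Per-read mean GC, weighting each distinct sequence by its copy number.
    mean_gc = sum(gc_fraction(s) * n for s, n in counts.items()) / total
    return {
        "total_reads": total,
        "distinct_sequences": len(counts),
        "duplicate_fraction": duplicated / total,
        "mean_gc": mean_gc,
        # (sequence, copies, GC fraction) for the most abundant sequences.
        "top_sequences": [
            (s, n, gc_fraction(s)) for s, n in counts.most_common(top_n)
        ],
    }
```

To use it, open one mate file per sample (for example `with open("sample_R1.fastq") as fh: summarize(read_fastq_sequences(fh))`, where the file name is a placeholder) and compare the summaries between the control and disease groups. If the top sequences in the disease samples turn out to be rRNA, that would fit with leftover rRNA driving both the duplication and the GC shift. Note that this holds every distinct sequence in memory, so for large files you may want to run it on a subsample of reads.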

written 4.3 years ago by Devon Ryan94k