Reads with high duplication levels and biased GC content

8.4 years ago

mjg ▴ 30

Hi everyone,

I'm working with RNA-seq data obtained with a total RNA protocol. The reads are paired-end and 75 bases long. There is a control group and a disease group.

The disease group shows a very consistent, biased pattern across samples in both GC content and duplication levels, with some sequences present in >10,000 K copies. (See images below.)

Has anyone seen something similar and found a way to explain it?

Thanks,
Maria

RNA-Seq gc-content duplication bias • 1.9k views

updated 20 months ago by Ram 43k • written 8.4 years ago by mjg ▴ 30

If the bias is consistent within the group and doesn't correspond to a batch effect of some sort, then perhaps it's real. A high duplication rate is expected in any RNA-seq experiment, and it's typically very high in total RNA-seq experiments (even if you attempt ribo-depletion, you still end up sequencing >10% rRNA half the time).
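One way to look into this is to list the most duplicated read sequences together with their GC content, then check by hand (for example with BLAST) whether they are rRNA. The following is only a minimal sketch, not a replacement for a proper QC tool. It assumes an uncompressed four-line-per-record FASTQ file, and it defines the duplication rate as 1 minus unique sequences over total reads, which is one simple definition among several. The function names `read_fastq_sequences`, `gc_fraction` and `duplication_summary` are made up for this illustration.

```python
from collections import Counter


def read_fastq_sequences(path):
    """Yield the sequence line of each record in an uncompressed FASTQ file.

    Assumes the strict 4-lines-per-record layout (no wrapped sequences).
    """
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every record is the sequence
                yield line.strip()


def gc_fraction(seq):
    """Return the fraction of G/C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def duplication_summary(sequences, top_n=5):
    """Summarise exact-sequence duplication for an iterable of reads.

    duplication_rate is defined here as 1 - unique/total (an assumption of
    this sketch; other tools estimate duplication differently).
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        return {"total_reads": 0, "duplication_rate": 0.0, "top": []}
    top = [(seq, n, gc_fraction(seq)) for seq, n in counts.most_common(top_n)]
    return {
        "total_reads": total,
        "duplication_rate": 1.0 - len(counts) / total,
        "top": top,  # (sequence, copies, GC fraction), most frequent first
    }


if __name__ == "__main__":
    # Toy reads standing in for read_fastq_sequences("sample_R1.fastq"),
    # where the file name is hypothetical.
    toy_reads = ["GGCC", "GGCC", "GGCC", "ATAT"]
    summary = duplication_summary(toy_reads)
    for seq, copies, gc in summary["top"]:
        print(seq, copies, round(gc, 2))
```

If the top sequences in the disease samples turn out to be rRNA or another very abundant transcript, that would support the bias being biological or protocol-related rather than a sequencing artifact. Holding every read in memory like this is only practical for a subsample of a real run.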

8.4 years ago by Devon Ryan 104k
