Question: Why are there still duplicates after extracting reads that aligned exactly 1 time?
24 months ago by
star240 wrote:

I have some ChIP-seq data that I aligned with bowtie2 --very-sensitive --score-min C,0,0. I then extracted the reads that aligned exactly 1 time and removed duplicates using Picard tools. Why are there still duplicates when I have kept only reads that aligned exactly 1 time?

Thanks in advance!

modified 24 months ago by ATpoint36k • written 24 months ago by star240

Do you have an example read? Could these be reads that aligned to 2 different parts of the genome?

written 24 months ago by Titus910
24 months ago by
ATpoint36k wrote:

You are mixing up vocabulary. Duplication means that the exact same DNA fragment (or at least a fragment with the same start and end coordinates) has been sequenced multiple times. In experiments like ChIP, we typically remove duplicates to avoid counting technical artifacts that would inflate the actual counts within a region. Reads aligning more than once are a completely different thing: it means that a given DNA sequence occurs more than once in the genome. This can happen because the fragment comes from a repetitive region, such as a telomere or centromere, or from paralogous genes that share high sequence similarity. So a read can align to exactly one place in the genome and still be a duplicate of another read that aligns to that same place. Still, keeping only reads that align once and then removing duplicates is fine, and you may proceed with your analysis.
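To make the distinction concrete, here is a minimal Python sketch on made-up data. Everything in it is an assumption for illustration: the toy records, the `n_hits` field, and the helper names `keep_unique_mappers` and `mark_duplicates` are hypothetical, and the grouping rule (same chromosome, start, end, and strand) is a simplification, not Picard's actual algorithm, whose criteria are more involved.

```python
from collections import defaultdict

# Hypothetical toy alignments: (read_name, chrom, start, end, strand, n_hits).
# n_hits = number of genomic locations the read sequence aligns to.
alignments = [
    ("r1", "chr1", 100, 150, "+", 1),
    ("r2", "chr1", 100, 150, "+", 1),  # same coordinates as r1: duplicate
    ("r3", "chr1", 100, 150, "+", 1),  # same coordinates as r1: duplicate
    ("r4", "chr2", 500, 550, "-", 1),
    ("r5", "chr3", 900, 950, "+", 4),  # multi-mapper, e.g. from a repeat
]


def keep_unique_mappers(records):
    """Keep reads whose sequence aligns to exactly one genomic location."""
    return [r for r in records if r[5] == 1]


def mark_duplicates(records):
    """Group reads by coordinates; keep the first read of each group."""
    groups = defaultdict(list)
    for name, chrom, start, end, strand, _ in records:
        groups[(chrom, start, end, strand)].append(name)
    kept, dups = [], []
    for names in groups.values():
        kept.append(names[0])    # one representative per position
        dups.extend(names[1:])   # the rest are flagged as duplicates
    return kept, dups


unique = keep_unique_mappers(alignments)
kept, duplicates = mark_duplicates(unique)

print("uniquely aligned:", [r[0] for r in unique])
print("kept after dedup:", kept)
print("duplicates:", duplicates)
```

In this toy set, filtering for unique alignment only drops r5. The reads r2 and r3 each align to a single location, yet they are still duplicates of r1 because all three share the same coordinates, which is why the duplicate-removal step still finds something to remove.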

modified 20 months ago • written 24 months ago by ATpoint36k