Why are there still duplicates after extracting reads that aligned exactly once?
5.7 years ago
star ▴ 350

I have some ChIP-seq data that I aligned with bowtie2 using --very-sensitive --score-min C,0,0. I then extracted the reads that aligned exactly 1 time and removed duplicates with Picard tools. Why were there still duplicates to remove when I had already kept only the reads that aligned exactly 1 time?

Thanks in advance!

RNA-Seq ChIP-Seq alignment duplicate samtools • 1.3k views

Do you have an example of such a read? Could these be reads that align to two different parts of the genome?

5.7 years ago
ATpoint 81k

You are mixing up vocabulary. Duplication means that the exact same DNA fragment (or at least a fragment with the same start and end coordinates) has been sequenced multiple times. In experiments like ChIP-seq, we typically remove duplicates to avoid counting technical artifacts that would inflate the actual counts within a region. Reads aligning more than once are a completely different thing: it means that a given DNA sequence occurs more than once in the genome. This can happen because the fragment comes from a repetitive region, such as a telomere or centromere, or from paralogous genes that share high sequence similarity. The two properties are independent, so a read that aligns to exactly one location can still be a duplicate of another read at that same location. Using only reads that align once and then removing duplicates is fine, and you may proceed with your analysis.
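To make the distinction concrete, here is a minimal Python sketch on toy data. Everything in it is an assumption made for illustration: the `Read` record, its `n_hits` field (the number of genomic locations the read sequence aligns to), and the two helper functions are hypothetical and are not part of bowtie2, samtools, or Picard. The duplicate rule is also deliberately simplified to "same chromosome, start, and strand"; real duplicate-marking tools use more careful criteria.

```python
from collections import namedtuple

# Hypothetical toy record, not a real BAM/SAM structure.
# n_hits = number of genomic locations this read sequence aligns to.
Read = namedtuple("Read", ["name", "chrom", "start", "strand", "n_hits"])

reads = [
    Read("r1", "chr1", 1000, "+", 1),  # aligns once
    Read("r2", "chr1", 1000, "+", 1),  # aligns once, same position as r1: a duplicate
    Read("r3", "chr1", 5000, "-", 1),  # aligns once, no duplicate
    Read("r4", "chr2", 300, "+", 4),   # multi-mapper (e.g. from a repeat)
    Read("r5", "chr2", 300, "+", 4),   # multi-mapper and also a duplicate of r4
]


def keep_unique_mappers(reads):
    """Keep reads whose sequence aligns to exactly one genomic location."""
    return [r for r in reads if r.n_hits == 1]


def remove_duplicates(reads):
    """Keep the first read seen at each (chrom, start, strand) position.

    Simplified stand-in for duplicate removal; assumed rule, not Picard's.
    """
    seen = set()
    kept = []
    for r in reads:
        key = (r.chrom, r.start, r.strand)
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


unique = keep_unique_mappers(reads)
deduped = remove_duplicates(unique)

print([r.name for r in unique])   # ['r1', 'r2', 'r3']
print([r.name for r in deduped])  # ['r1', 'r3']
```

In this toy set, the unique-mapper filter drops r4 and r5 but keeps both r1 and r2, because each of them aligns to a single location. Only the separate duplicate-removal step collapses r1 and r2 into one read, which is why duplicates can remain after keeping reads that aligned exactly 1 time.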
