Question: Definition of PCR duplicates based on alignment coordinates
3.2 years ago, abascalfederico wrote:

Dear all,

I need to identify PCR duplicates myself and would like to understand how tools such as Picard's MarkDuplicates and Samtools' rmdup define them.

Do they require both the start and end alignment coordinates to be identical? Since read quality usually degrades during the last sequencing cycles, I was thinking it would be better to define duplicates as reads that share the same start coordinate, without also requiring them to share the same end coordinate.

Is this how Picard/Samtools define them?


pcr duplicates • 982 views

What I've found out so far: I do not have a definitive answer, but it seems the available software marks duplicates by comparing only the 5' coordinates of the alignments (with any clipped bases included in the coordinate, if clipping is present). For paired reads, the 5' coordinates of both the first and the second mate have to be identical to those of another pair for the pair to be considered a duplicate.
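To make the idea concrete, here is a minimal sketch of how single-end reads could be grouped by their clipping-adjusted 5' coordinate. It is not the actual Picard or Samtools code. The function names `unclipped_five_prime` and `group_duplicates` and the input tuple layout are hypothetical, and putting the strand in the key is my own assumption (a forward and a reverse read cannot share a 5' end in the same sense). Positions are 1-based leftmost coordinates and CIGAR strings follow the SAM format.

```python
import re
from collections import defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference
CLIP_OPS = set("SH")          # soft and hard clipping


def unclipped_five_prime(pos, cigar, reverse):
    """Return the 1-based reference coordinate of the read's 5' end,
    extended over any clipped bases at that end (hypothetical helper)."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not reverse:
        # Forward strand: the 5' end is the leftmost base, so shift
        # the position left by the number of leading clipped bases.
        leading = 0
        for n, op in ops:
            if op not in CLIP_OPS:
                break
            leading += n
        return pos - leading
    # Reverse strand: the 5' end is the rightmost base, so take the
    # alignment end and shift right by the trailing clipped bases.
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    trailing = 0
    for n, op in reversed(ops):
        if op not in CLIP_OPS:
            break
        trailing += n
    return pos + ref_len - 1 + trailing


def group_duplicates(reads):
    """Group single-end reads that share chromosome, strand and
    clipping-adjusted 5' coordinate. `reads` is an iterable of
    (name, chrom, pos, cigar, reverse) tuples (assumed layout)."""
    groups = defaultdict(list)
    for name, chrom, pos, cigar, reverse in reads:
        key = (chrom, unclipped_five_prime(pos, cigar, reverse), reverse)
        groups[key].append(name)
    # Only groups with more than one read contain candidate duplicates.
    return [names for names in groups.values() if len(names) > 1]


reads = [
    ("r1", "chr1", 100, "50M", False),
    ("r2", "chr1", 100, "40M10S", False),  # same start, different end
    ("r3", "chr1", 105, "5S45M", False),   # start shifted by a 5' soft clip
    ("r4", "chr1", 100, "50M", True),      # reverse strand, 5' end at 149
]
duplicate_groups = group_duplicates(reads)
```

With these toy reads, r1, r2 and r3 fall into one group even though their alignment ends differ, while r4 stays on its own. For paired reads the key would instead combine the 5' coordinates of both mates, as described above.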

written 3.2 years ago by abascalfederico