Hi, I want to use the Picard tools MarkDuplicates option, but after reading the manual I am still not sure I understand the method it uses. http://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates It reads: "The MarkDuplicates tool works by comparing sequences in the 5 prime positions of both reads and read-pairs in a SAM/BAM file"
Does this mean duplicates are marked based on their chromosome and start position plus the 5' sequence? Or does the tool take the full sequence into account by using the CIGAR data?
Thanks in advance.