Hi, I am a beginner and a little confused about duplicate filtering with different tools, so I have a few questions.
Most of the reads in my SAM file have optional fields (TAG:TYPE:VALUE) starting at the 12th column. My understanding is that XT:A:U means the read mapped uniquely to the reference. So if I keep only the reads with XT:A:U, have I already removed all potential duplicates, so that I no longer need samtools rmdup?
As far as I understand, Picard MarkDuplicates only marks (flags) the reads that are likely duplicates rather than removing them. Why mark them instead of deleting all the duplicates?
Assuming both Picard and samtools can remove duplicates, which one is more reliable?
Thank you
I forgot to ask one more question: when I filter on XT:A:U, there is a risk that one read of a paired-end pair is removed while its mate is kept in the data, right?
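
To make the last question concrete, here is a minimal Python sketch of the filter I have in mind. It is only an illustration: the function names (has_unique_tag, filter_unique), the keep_pairs_together option, and the toy records in demo are all made up for this post, and it assumes an uncompressed, tab-separated SAM file in which both mates of a pair share the same read name.

```python
from collections import defaultdict


def has_unique_tag(sam_line):
    """Return True if the optional fields (column 12 onward) include XT:A:U."""
    fields = sam_line.rstrip("\n").split("\t")
    return "XT:A:U" in fields[11:]


def filter_unique(sam_lines, keep_pairs_together=False):
    """Keep header lines plus the alignment records tagged XT:A:U.

    With keep_pairs_together=True (a hypothetical option for this sketch),
    a record is kept only if every record with the same read name also
    carries XT:A:U, so a pair is either kept whole or dropped whole.
    """
    header = [line for line in sam_lines if line.startswith("@")]
    records = [line for line in sam_lines if not line.startswith("@")]

    if not keep_pairs_together:
        # Naive per-read filter: mates are judged independently.
        return header + [line for line in records if has_unique_tag(line)]

    # Group records by read name (column 1) so mates are judged together.
    by_name = defaultdict(list)
    for line in records:
        by_name[line.split("\t")[0]].append(line)

    kept = [
        line
        for line in records
        if all(has_unique_tag(mate) for mate in by_name[line.split("\t")[0]])
    ]
    return header + kept


# Toy records (invented): pairA has one XT:A:U mate and one XT:A:R mate,
# pairB has XT:A:U on both mates.
demo = [
    "@HD\tVN:1.0\tSO:unsorted",
    "pairA\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII\tXT:A:U",
    "pairA\t147\tchr1\t200\t0\t4M\t=\t100\t-104\tACGT\tIIII\tXT:A:R",
    "pairB\t99\tchr1\t300\t60\t4M\t=\t400\t104\tACGT\tIIII\tXT:A:U",
    "pairB\t147\tchr1\t400\t60\t4M\t=\t300\t-104\tACGT\tIIII\tXT:A:U",
]

print(len(filter_unique(demo)))                            # 4: pairA is left with one mate
print(len(filter_unique(demo, keep_pairs_together=True)))  # 3: pairA is dropped entirely
```

With the naive per-read filter, pairA ends up with only one mate in the output, which is exactly the situation I am worried about. Note that the sketch only selects on the XT tag and does nothing about duplicates itself.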