I'm looking for official documentation of the algorithms behind Picard MarkDuplicates and SAMtools rmdup, but I can't find any. I have found numerous posts in "google land" where people state why one is better, but I want to know exactly how they both work (preferably without going through the code). For example, I have "heard" that MarkDuplicates is more "intelligent" because it allegedly considers variants within a read rather than just looking at where reads begin.
Can anyone point me towards a paper or documentation that discusses the true differences between the algorithms?
Thanks for your help!
Thank you Pierre, that's similar to what I've been hearing, but do you know where this is documented? I'm curious where you learned it. Thanks!
I looked at the source code.
Great, thanks Pierre!