Question: Picard MarkDuplicates and SamTools rmdup algorithm documentation
Mark Ebbert (United States) wrote, 4.4 years ago:

Hi,

I'm looking for official algorithm documentation on Picard MarkDuplicates and SamTools rmdup, but I can't find it. I have found numerous posts in "google land" where people state why one is better, but I want to know exactly how they both work (preferably without going through the code). For example, I have "heard" that MarkDuplicates is more "intelligent" because it allegedly considers variants within a read rather than just looking at where reads begin. 

Can anyone point me towards a paper or documentation that discusses the true differences between the algorithms?

Thanks for your help!

Mark

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 4.4 years ago:

SamTools rmdup 'only' compares two reads on chrom and pos (which could be wrong if the two reads come from two different libraries) and **removes** reads from the BAM: information is lost.
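
To make the library problem concrete, here is a minimal Python sketch of position-only duplicate removal. It is not the samtools code: the `Read` record, its fields, the function name `remove_dups_by_position`, and the rule of keeping the read with the highest mapping quality are all assumptions made for this illustration.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified alignment record (not a real BAM record)."""
    name: str
    chrom: str
    pos: int
    library: str
    mapq: int


def remove_dups_by_position(reads):
    """Keep one read per (chrom, pos); every other read is dropped."""
    best = {}
    for r in reads:
        key = (r.chrom, r.pos)  # the library is deliberately ignored
        # Assumption: the read with the highest mapping quality survives.
        if key not in best or r.mapq > best[key].mapq:
            best[key] = r
    return list(best.values())


reads = [
    Read("r1", "chr1", 100, "libA", 60),
    Read("r2", "chr1", 100, "libB", 40),  # same position, different library
    Read("r3", "chr1", 250, "libA", 60),
]
kept = remove_dups_by_position(reads)
kept_names = sorted(r.name for r in kept)
```

In this toy input, `r2` is discarded even though it comes from a different library than `r1`, and nothing in the output records that it ever existed.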

Picard sets the SAM flag 1024 (0x400) but does not delete the reads. Two pairs of reads are compared, as far as I know, using the chrom, the pos, the group ID (sample...), plus the flowcell, lane, and X,Y coordinates for optical duplicates (and the CIGAR string?).
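
The marking approach can be sketched in Python as follows. This is only an illustration of the idea described above, not Picard's implementation: the `Read` record, the `score` field, and `mark_duplicates` are hypothetical names, and for brevity the sketch compares single reads by library, chrom, and pos rather than read pairs, and ignores optical duplicates.

```python
from typing import NamedTuple

DUP_FLAG = 0x400  # 1024: the SAM flag bit for "PCR or optical duplicate"


class Read(NamedTuple):
    """Hypothetical, simplified alignment record (not a real BAM record)."""
    name: str
    chrom: str
    pos: int
    library: str
    flag: int
    score: int  # assumption: some quality score used to pick the read to keep


def mark_duplicates(reads):
    """Return every read, with the duplicate bit set on all but the best per key."""
    best = {}
    for r in reads:
        key = (r.library, r.chrom, r.pos)  # the library is part of the key
        if key not in best or r.score > best[key].score:
            best[key] = r
    # Nothing is deleted: duplicates only get the 0x400 bit added to their flag.
    return [
        r if best[(r.library, r.chrom, r.pos)] is r
        else r._replace(flag=r.flag | DUP_FLAG)
        for r in reads
    ]


reads = [
    Read("r1", "chr1", 100, "libA", 0, 30),
    Read("r2", "chr1", 100, "libA", 0, 20),  # same library and position as r1
    Read("r3", "chr1", 100, "libB", 0, 10),  # same position, other library
]
marked = mark_duplicates(reads)
dup_names = [r.name for r in marked if r.flag & DUP_FLAG]
```

Because the reads are only flagged, a downstream tool can choose to skip or include them, and `r3` is left unmarked since it belongs to a different library.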

 

 

 


Thank you Pierre, that's similar to what I've been hearing, but do you know where this is documented? I'm curious where you learned it. Thanks!

written 4.4 years ago by Mark Ebbert

I looked at the sources.

written 4.4 years ago by Pierre Lindenbaum

Great, thanks Pierre!

written 4.4 years ago by Mark Ebbert