Question: GATK Pipeline: MarkDuplicates At The End?
5.9 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

The GATK pipeline marks duplicates at the end, after merging the BAMs.

In order to remove the optical duplicates for each lane, I would have put this operation right after the alignment with BWA, per lane/sample (= parallelization = faster).

Is there any reason to mark the duplicates at this position in their pipeline?

http://cdn.vanillaforums.com/gatk.vanillaforums.com/FileUpload/55/0a67f9e1b7962a14c422e993f34643.jpeg

Modified 11 months ago by Biostar; written 5.9 years ago by Pierre Lindenbaum
5.9 years ago, Stefano Berri (Cambridge, UK) wrote:

As far as I understand, they are sequencing the same LIBRARY in different lanes here. I don't know what you mean by "optical duplicates", but what you want to get rid of are PCR duplicates, i.e. the same molecule (produced during PCR amplification) sequenced twice, either as two "spots" on the same lane or in different lanes. That's why you need to mark duplicates after you have merged all reads from a particular library. I guess you could also mark duplicates before merging, but you definitely need to do it after the merge.
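To make the argument concrete, here is a minimal Python sketch, not the GATK/Picard implementation. It assumes a toy model in which a duplicate is any read from the same library with the same chromosome, start position and strand as an earlier read. The `Read` type, the `mark_duplicates` function and the read data are all hypothetical. Real tools also consider mate positions and keep the best-quality read rather than the first one.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    library: str
    lane: int
    chrom: str
    pos: int      # alignment start (toy stand-in for the 5' position)
    strand: str


def mark_duplicates(reads):
    """Return the names of reads flagged as duplicates.

    Toy rule (an assumption): within one library, every read after the
    first that shares (chrom, pos, strand) is a duplicate.
    """
    groups = defaultdict(list)
    for r in reads:
        groups[(r.library, r.chrom, r.pos, r.strand)].append(r.name)
    dups = set()
    for names in groups.values():
        dups.update(names[1:])  # keep the first read, flag the rest
    return dups


reads = [
    Read("r1", "libA", 1, "chr1", 100, "+"),
    Read("r2", "libA", 1, "chr1", 100, "+"),  # duplicate within lane 1
    Read("r3", "libA", 2, "chr1", 100, "+"),  # same fragment, but on lane 2
    Read("r4", "libA", 2, "chr1", 250, "-"),
]

# Marking each lane separately only sees duplicates inside that lane.
per_lane = set()
for lane in sorted({r.lane for r in reads}):
    per_lane |= mark_duplicates([r for r in reads if r.lane == lane])

# Marking after merging all lanes of the library also catches r3.
merged = mark_duplicates(reads)

print(sorted(per_lane))  # ['r2']
print(sorted(merged))    # ['r2', 'r3']
```

In this toy data the per-lane pass misses `r3`, because its earlier copy sits on another lane. That is the case the post-merge marking step is there to catch.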

I hope this helps


Optical duplicate = two spots, close to each other, mapping to the same fragment.

written 5.9 years ago by Pierre Lindenbaum
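The definition above can be sketched as a simple proximity test. This is an illustrative assumption rather than any tool's exact rule: two duplicate reads count as an optical pair when they come from the same lane and tile and their spot coordinates differ by at most `max_dist` pixels on each axis. The `Spot` type, the `is_optical_pair` function, the coordinates and the 100-pixel threshold are all hypothetical choices for the example.

```python
from typing import NamedTuple


class Spot(NamedTuple):
    lane: int
    tile: int
    x: int  # spot coordinates on the tile, in pixels
    y: int


def is_optical_pair(a, b, max_dist=100):
    """True if two duplicate reads are close enough to be optical duplicates.

    Assumed rule: same lane, same tile, and within max_dist pixels on
    both axes.
    """
    return (
        a.lane == b.lane
        and a.tile == b.tile
        and abs(a.x - b.x) <= max_dist
        and abs(a.y - b.y) <= max_dist
    )


a = Spot(lane=1, tile=1101, x=1000, y=2000)
b = Spot(lane=1, tile=1101, x=1040, y=2030)  # neighbouring spot, same tile
c = Spot(lane=2, tile=1101, x=1000, y=2000)  # same coordinates, other lane

print(is_optical_pair(a, b))  # True
print(is_optical_pair(a, c))  # False
```

Under this definition an optical duplicate can never span two lanes, so a per-lane pass would find all of them. PCR duplicates have no such restriction.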

OK, then those should be very few. PCR duplicates can be far more numerous and more serious. We did have some libraries with up to 70% PCR duplicates; clearly not good libraries.

written 5.9 years ago by Stefano Berri

OK, "PCR duplicates" is a good argument.

written 5.9 years ago by Pierre Lindenbaum