Question: Removing Duplicates for Variant calling when using genes as a reference?
2.7 years ago by
European Union
DR10 wrote:

Hello everybody!

I am planning to perform variant calling against a gene catalog that I retrieved from metagenomic samples, and I wonder how removing or marking duplicates could affect this. My reads are about 250 bp long overall, so I thought that a high proportion of secondary alignments may be real alignments to neighboring genes rather than real PCR duplicates introduced during library preparation, and I do not know to what extent removing them would decrease detection sensitivity. Do you have any suggestions?

Many thanks!

sequencing snp next-gen gene • 1.4k views
modified 2.2 years ago by Biostar ♦♦ 20 • written 2.7 years ago by DR10
2.7 years ago by
Kevin Blighe61k
Kevin Blighe61k wrote:


The removal of PCR / optical duplicates is a topic that always makes for a good debate. The exact wet-lab process that was performed is critical.

What I suggest that you do is first detect PCR duplicates with Picard MarkDuplicates and then gauge whether or not you should remove them. In some circumstances, again based on the wet-lab process, it's just not feasible or correct to remove duplicates.
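To gauge how many reads were actually marked, one option is to tally the SAM flag bits after duplicate marking. The sketch below is only an illustration, not part of Picard: it assumes the marked alignments are available as SAM text lines (for example from `samtools view -h`), and the function names `tally_flags` and `duplicate_fraction` are hypothetical. It also keeps secondary alignments (flag 0x100) separate from duplicates (flag 0x400), since the two are different things in the SAM specification.

```python
from collections import Counter

# Flag bits as defined in the SAM specification.
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_DUPLICATE = 0x400
FLAG_SUPPLEMENTARY = 0x800


def tally_flags(sam_lines):
    """Count record categories in an iterable of SAM text lines.

    Each record is placed in exactly one of: unmapped, secondary,
    supplementary, or primary. Duplicates are counted among primary
    mapped records only.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & FLAG_UNMAPPED:
            counts["unmapped"] += 1
        elif flag & FLAG_SECONDARY:
            counts["secondary"] += 1
        elif flag & FLAG_SUPPLEMENTARY:
            counts["supplementary"] += 1
        else:
            counts["primary"] += 1
            if flag & FLAG_DUPLICATE:
                counts["duplicate"] += 1
    return counts


def duplicate_fraction(counts):
    """Fraction of primary mapped records that carry the duplicate flag."""
    if counts["primary"] == 0:
        return 0.0
    return counts["duplicate"] / counts["primary"]
```

If the fraction is small, removing the marked reads is unlikely to change much; if it is large, it is worth checking the wet-lab protocol before discarding anything.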

At the end of the day, if you do extensive testing with and without duplicates, I think that you'll find that your results will mostly stay the same.
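One way to run that comparison is to call variants twice, once with duplicates kept and once with them removed, and then compare the two call sets. The following is a minimal sketch under stated assumptions: both call sets are plain-text VCF lines, variants are matched on (CHROM, POS, REF, ALT) only, and the helper names `load_variants` and `compare_calls` are made up for illustration. It ignores genotypes, quality values and allele normalization.

```python
def load_variants(vcf_lines):
    """Return a set of (chrom, pos, ref, alt) tuples from VCF text lines.

    Multi-allelic records are split so that each ALT allele is its own entry.
    """
    variants = set()
    for line in vcf_lines:
        # Skip blank lines and VCF header lines (which start with '#').
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            variants.add((chrom, int(pos), ref, allele))
    return variants


def compare_calls(with_dups, without_dups):
    """Summarize the overlap between two sets of variant tuples."""
    shared = with_dups & without_dups
    union = with_dups | without_dups
    return {
        "shared": len(shared),
        "only_with_dups": len(with_dups - without_dups),
        "only_without_dups": len(without_dups - with_dups),
        # Jaccard index: 1.0 means the two call sets are identical.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }
```

A Jaccard index close to 1 would support the expectation that the results mostly stay the same; the variants unique to one run are the ones worth inspecting by hand.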

