Question: Variant calling step order: base recalibration or mark duplicates, which comes first?
alons wrote, 18 months ago:

Hi all,

We're going through & revising our variant calling pipeline on NGS data from cancer patients and a question came up:

Which step should be done first (and why), base recalibration or mark duplicates?

Currently we recalibrate bases first and then mark duplicates.

The reason I'm asking this is that we originally based part of our pipeline on the following article, which said that you recalibrate bases and then mark duplicates:

However, the following Broad Institute best practices page says the opposite: mark duplicates first, then recalibrate bases. I saw the same order in another paper as well:

Thanks in advance!



As per the GATK best practices workflow, mark duplicates first, followed by base recalibration.

Comment written 18 months ago by cpad0112
mforde84 wrote, 18 months ago:

I'd probably remove duplicates first, since BQSR builds its covariate model from all of the supplied reads. I'm assuming that having a bunch of clonal artifacts in your dataset might throw this off a little. But honestly, you should ask the GATK people, as they have a better understanding of the underlying model.
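
To make the "clonal artifacts might throw this off" intuition concrete, here is a toy sketch. It is not the actual GATK recalibration model, just a smoothed Phred-scaled mismatch rate for a single covariate bin, and all names and counts are made up for illustration. The idea is that a PCR error copied into many duplicates gets counted as many independent sequencing errors.

```python
import math


def empirical_quality(mismatches: int, observations: int) -> float:
    """Phred-scaled mismatch rate for one bin, with +1/+2 smoothing to avoid log(0).

    Toy estimator for illustration only; the real recalibration model is more involved.
    """
    error_rate = (mismatches + 1) / (observations + 2)
    return -10 * math.log10(error_rate)


# Hypothetical counts for one covariate bin built from unique reads only.
unique_observations = 1000
unique_mismatches = 1

# Hypothetical clonal artifact: one fragment carrying a PCR error is
# present as 20 extra duplicate copies, each adding one observation
# and one mismatch to the same bin.
duplicate_copies = 20
observations_with_dups = unique_observations + duplicate_copies
mismatches_with_dups = unique_mismatches + duplicate_copies

q_without_dups = empirical_quality(unique_mismatches, unique_observations)
q_with_dups = empirical_quality(mismatches_with_dups, observations_with_dups)

if __name__ == "__main__":
    print(f"Estimated quality without duplicates: {q_without_dups:.1f}")
    print(f"Estimated quality with duplicates:    {q_with_dups:.1f}")
```

In this toy setup the bin looks much worse when the duplicates are counted, even though the sequencer itself made no additional errors.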

Brian Bushnell (Walnut Creek, USA) wrote, 18 months ago:

Recalibrating bases should not really improve (or affect) duplicate detection. But duplicate removal can improve recalibration, so I'd do that first. And the earlier you remove duplicates, the faster everything else becomes.
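
For reference, the order suggested in this thread (mark duplicates, then recalibrate) could be sketched roughly like this. It assumes GATK4-style command lines (`MarkDuplicates`, `BaseRecalibrator`, `ApplyBQSR`); the function name and every file name are placeholders I made up, and the function only builds the commands rather than running them.

```python
from typing import List


def build_commands(bam: str, reference: str, known_sites: str, prefix: str) -> List[List[str]]:
    """Return the three commands in the order discussed: dedup first, then BQSR.

    All output file names are derived from the hypothetical `prefix` argument.
    """
    dedup_bam = f"{prefix}.dedup.bam"
    metrics = f"{prefix}.dup_metrics.txt"
    recal_table = f"{prefix}.recal.table"
    recal_bam = f"{prefix}.dedup.recal.bam"

    return [
        # Step 1: flag duplicate reads in the aligned BAM.
        ["gatk", "MarkDuplicates", "-I", bam, "-O", dedup_bam, "-M", metrics],
        # Step 2: build the recalibration table from the duplicate-marked BAM.
        ["gatk", "BaseRecalibrator", "-I", dedup_bam, "-R", reference,
         "--known-sites", known_sites, "-O", recal_table],
        # Step 3: apply the recalibration to the duplicate-marked BAM.
        ["gatk", "ApplyBQSR", "-I", dedup_bam, "-R", reference,
         "--bqsr-recal-file", recal_table, "-O", recal_bam],
    ]


if __name__ == "__main__":
    # Placeholder inputs; substitute your own sorted BAM, reference and known-sites VCF.
    for cmd in build_commands("sample.bam", "ref.fasta", "known_sites.vcf.gz", "sample"):
        print(" ".join(cmd))
```

To actually run the steps, each command list could be passed to `subprocess.run(cmd, check=True)` in turn, so a failure in duplicate marking stops the pipeline before recalibration starts.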
