Question: Does MarkDuplicates remove duplicate sequences in the library?
Shixiang50 wrote:

Dear all,

I have a question about sequencing and BQSR mark duplicates, which may be stupid.

One key step of GATK is MarkDuplicates, which removes duplicates such as PCR duplicates. My question is: are duplicate segments produced when the library is constructed? And does MarkDuplicates remove such duplicates, and if so, why?

Best, Shixiang

sequencing wes gatk • 156 views
modified 6 months ago • written 6 months ago by Shixiang50
finswimmer13k wrote:

Hello Shixiang,

you are mixing things up. BQSR (Base Quality Score Recalibration) and MarkDuplicates are two different things.

1. BQSR (Base Quality Score Recalibration):

The quality values that the sequencing machine assigned to each base within a read are replaced by new values, which are meant to be more accurate.

More information about it: https://gatkforums.broadinstitute.org/gatk/discussion/44/base-quality-score-recalibration-bqsr
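To make the idea of "reassigning" quality values concrete, here is a minimal Python sketch. It only groups bases by their reported quality and computes an empirical quality from the observed mismatch rate. The function names and the add-one smoothing are my own illustrative assumptions, not GATK's implementation; the real tool also uses further covariates (such as read group, machine cycle and sequence context) and ignores known variant sites.

```python
import math
from collections import defaultdict


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def empirical_qualities(observations):
    """Map each reported quality to an empirically observed quality.

    observations: iterable of (reported_quality, is_mismatch) pairs,
    one per base. This input format is a hypothetical simplification.
    """
    counts = defaultdict(lambda: [0, 0])  # reported_q -> [mismatches, total]
    for reported_q, is_mismatch in observations:
        counts[reported_q][0] += int(is_mismatch)
        counts[reported_q][1] += 1
    # Add-one smoothing (an assumption of this sketch) avoids an infinite
    # quality for bins in which no mismatch was observed.
    return {
        q: error_to_phred((mismatches + 1) / (total + 2))
        for q, (mismatches, total) in counts.items()
    }
```

If, for example, bases reported as Q30 (expected error rate 0.001) mismatch the reference far more often than that, their recalibrated quality comes out much lower than 30.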

2. MarkDuplicates:

During library preparation there are PCR steps, which produce fragments that are copies of one and the same original DNA molecule. These duplicates are recognized by their identical 5' mapping positions; one read of each set is kept as the representative, and the rest are flagged as duplicates (or removed, if REMOVE_DUPLICATES is set). The reason for excluding such duplicates is to avoid introducing a bias when one original molecule is overrepresented due to uneven amplification. Note: if your library prep is amplicon-based, meaning you use PCR to obtain your target region, do not remove duplicates, because virtually all of your reads are duplicates.
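The grouping logic described above can be sketched in a few lines of Python. This is a simplified illustration, not the Picard/GATK code: the `Read` type, its fields and `mark_duplicates` are hypothetical names, and reads are treated as single-end. The real tool works on read pairs, uses unclipped 5' coordinates, and records the result by setting the duplicate flag (0x400) in the BAM file.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    # Hypothetical, simplified read record for this sketch.
    name: str
    chrom: str
    five_prime_pos: int    # 5' mapping position of the read
    strand: str            # "+" or "-"
    base_quality_sum: int  # used to choose the representative read


def mark_duplicates(reads):
    """Return the names of reads that would be flagged as duplicates."""
    # Reads with the same chromosome, 5' position and strand are assumed
    # to be copies of the same original molecule.
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.five_prime_pos, read.strand)].append(read)

    duplicates = set()
    for group in groups.values():
        # Keep the read with the highest summed base quality.
        best = max(group, key=lambda r: r.base_quality_sum)
        for read in group:
            if read is not best:
                duplicates.add(read.name)
    return duplicates


reads = [
    Read("a", "chr1", 100, "+", 300),
    Read("b", "chr1", 100, "+", 350),
    Read("c", "chr1", 100, "-", 200),
    Read("d", "chr2", 100, "+", 100),
]
flagged = mark_duplicates(reads)
```

Here only read "a" ends up in `flagged`: it shares chromosome, position and strand with "b", which has the higher quality sum. This also shows why amplicon data is a problem: nearly all reads would fall into a handful of groups and be flagged.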

fin swimmer

written 6 months ago by finswimmer13k

Thanks, I have corrected my question. My data is WES (tumor and normal samples) downloaded from NCBI, and I am using it for mutation calling and copy number calling. Should I remove duplicates?

written 6 months ago by Shixiang50