Do I need to mark duplicates in sequencing data from a multiplex PCR library prep?
3.9 years ago
MatthewP ★ 1.1k

Hello. I have some mtDNA sequencing data from a library prepared by multiplex PCR: many primer pairs are used to amplify the whole mtDNA from total genomic DNA, and the resulting fragments are then sequenced. I filtered my raw data with fastp, and the result shows:

Filtering result:
reads failed due to low quality: 19374
reads failed due to too many N: 68
reads failed due to too short: 38
bases trimmed due to adapters: 687212

Duplication rate: 96.892%


The duplication rate is extremely high, which I think is caused by the multiplex PCR library preparation. Should I mark duplicates in this situation? Thanks, everyone.

mtDNA picard multiplex PCR
What question do you want to answer? Because the mitochondrial genome is only about 17 kb, a very high duplication rate is normal and expected.
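
The genome-size argument can be checked with a little arithmetic. This is only a rough sketch in Python: the read count is a made-up number (the question does not give one), and it assumes single-end reads where a duplicate means "same start position and strand". With a genome this small there are only so many distinct start positions, so any reads beyond that number must be duplicates.

```python
def min_duplication_rate(n_reads: int, genome_len: int, strands: int = 2) -> float:
    """Lower bound on position-based duplication for single-end reads."""
    # At most one non-duplicate read per start position per strand.
    max_unique = genome_len * strands
    if n_reads <= max_unique:
        return 0.0
    return 1.0 - max_unique / n_reads

GENOME_LEN = 17_000   # approximate mtDNA size quoted in this comment
N_READS = 1_000_000   # hypothetical read count, not taken from the question

print(f"{min_duplication_rate(N_READS, GENOME_LEN):.1%}")
```

This is a floor, not a prediction: real data will usually sit above it, and fastp estimates duplication from read sequences rather than from mapped positions.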

3.9 years ago

Hello,

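No, you shouldn't mark duplicates. In an amplicon library, reads from the same primer pair share the same start and end positions, so by design essentially all of your reads are duplicates.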
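
As a small illustration, here is a minimal Python sketch of what position-based duplicate marking would see in amplicon data. The panel size (50 primer pairs) and the number of read pairs (20,000) are hypothetical values picked for the example, and the function name is made up; nothing here is taken from the question or from any particular tool.

```python
import random

def amplicon_duplication_rate(n_read_pairs: int, n_amplicons: int, seed: int = 0) -> float:
    """Fraction of read pairs a position-based tool would flag as duplicates."""
    rng = random.Random(seed)
    # A read pair's coordinates are fixed by the amplicon (primer pair) it
    # came from, so the amplicon index stands in for its (start, end) key.
    unique_keys = {rng.randrange(n_amplicons) for _ in range(n_read_pairs)}
    # One read pair per key is kept; every other pair counts as a duplicate.
    return 1.0 - len(unique_keys) / n_read_pairs

# Hypothetical values, chosen only for illustration.
rate = amplicon_duplication_rate(n_read_pairs=20_000, n_amplicons=50)
print(f"{rate:.2%}")
```

Under these assumptions, marking or removing duplicates would leave roughly one read pair per amplicon and throw away almost all of the coverage, which is why the step is skipped for this kind of library.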

fin swimmer