What are duplicate reads in sequencing?
7 months ago
donny.dw ▴ 20

From the internet:

There are two main sources of duplicates: polymerase chain reaction (PCR) duplicates and natural duplicates. Unlike natural duplicates that represent true signals from sequencing of independent DNA templates, PCR duplicates are artifacts originating from sequencing of identical copies amplified from the same DNA template.

A DNA or RNA fragment is amplified roughly 2^10-fold during the library prep steps. Why are duplicate reads bad for sequencing?

reads duplicate • 287 views

Duplicate reads don't provide much new information. A PCR copying error that occurs in an early cycle is propagated to every copy descended from that molecule. So if a short variant shows up only in reads from a single fragment that PCR copied many times, it is usually discarded as an artefact. There are smarter PCR protocols for applications such as liquid biopsy that attach a unique barcode to each initial DNA fragment before amplification, but this is not standard practice.
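To put a rough number on the early-cycle point, here is a minimal sketch. It assumes an idealized PCR in which every molecule doubles each cycle and the error arises in exactly one molecule; the function name `error_fraction` and the default of 10 cycles (taken from the 2^10 figure in the question) are just for illustration.

```python
def error_fraction(error_cycle: int, total_cycles: int = 10) -> float:
    """Fraction of final copies carrying an error introduced in a single
    molecule at `error_cycle`, assuming perfect doubling every cycle."""
    if not 1 <= error_cycle <= total_cycles:
        raise ValueError("error_cycle must be between 1 and total_cycles")
    # The erroneous molecule doubles in each of the remaining cycles.
    carriers = 2 ** (total_cycles - error_cycle)
    total = 2 ** total_cycles
    return carriers / total


print(error_fraction(1))   # error in the first cycle: 0.5
print(error_fraction(10))  # error in the last cycle: 0.0009765625
```

Under these assumptions an error made in cycle 1 ends up in half of all copies, while one made in the last cycle is in about 0.1% of them. Real amplification is less than perfectly efficient, so treat the numbers as an upper-level intuition only.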

Some techniques (such as AmpliSeq) rely heavily on multiplex PCR, and removing PCR duplicates there is very tricky because almost all reads come from duplicated fragments.
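A toy sketch of the two ideas above, assuming reads are already aligned: grouping by position alone versus grouping by position plus the barcode attached before amplification. The `Read` record, its `umi` field and the helper names are hypothetical and only for illustration. Real tools such as Picard MarkDuplicates or UMI-tools do considerably more (mate positions, barcode sequencing errors, choosing the best-quality read).

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    start: int   # alignment start of the read
    strand: str
    umi: str     # barcode added to the original fragment (hypothetical field)


def group_duplicates(reads, use_umi=True):
    """Group reads that look like copies of the same original fragment."""
    groups = defaultdict(list)
    for r in reads:
        key = (r.chrom, r.start, r.strand)
        if use_umi:
            key += (r.umi,)
        groups[key].append(r)
    return groups


def deduplicate(reads, use_umi=True):
    """Keep the first read of every duplicate group."""
    return [members[0] for members in group_duplicates(reads, use_umi).values()]


reads = [
    Read("r1", "chr1", 100, "+", "AACG"),
    Read("r2", "chr1", 100, "+", "AACG"),  # PCR copy of r1
    Read("r3", "chr1", 100, "+", "TTGA"),  # same position, different template
    Read("r4", "chr1", 250, "-", "AACG"),
]

print([r.name for r in deduplicate(reads, use_umi=False)])  # ['r1', 'r4']
print([r.name for r in deduplicate(reads, use_umi=True)])   # ['r1', 'r3', 'r4']
```

Without barcodes, r3 is thrown away even though it came from an independent template (a natural duplicate). With amplicon data nearly every read shares its start position with many others, so position-only grouping would collapse almost everything, which is why barcodes matter most there.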

