Question

Are these reads all PCR replicates?

0

8.2 years ago

Nicolas Rosewick 11k

Hi,

I did a sequencing run in which I enriched specific DNA regions, so I expect a lot of PCR duplicates. The figure below is an IGV screenshot of one such region. Nearly all reads are identical, but a few of them contain mismatches (see arrows); fewer than 1% of the reads have any mismatch. Can I consider these reads PCR duplicates, or are they genuinely different DNA fragments?

[Image: IGV screenshot of the region, with arrows marking the reads that contain mismatches]

Thanks

dna-seq • 2.1k views

ADD COMMENT • link updated 8.2 years ago by stolarek.ir ▴ 700 • written 8.2 years ago by Nicolas Rosewick 11k

0

I think that checking the Phred score for those bases that are different can give you some insights. But in general I don't think there's an easy way to know if two reads are PCR duplicates.
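As a rough illustration of that check, here is a minimal Python sketch that lists the mismatching bases of one read together with their Phred scores and the implied error probability. It assumes an ungapped alignment and Phred+33 quality encoding; the function names and the example sequences are made up for illustration and do not come from the data in the question.

```python
def phred_to_error_prob(q):
    """Convert a Phred score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def mismatch_qualities(read_seq, qual_string, ref_seq, offset=33):
    """Return (position, ref_base, read_base, phred) for every mismatch.

    Assumes the read is aligned to ref_seq without gaps and that the
    quality string is Phred+33 encoded (the usual FASTQ/SAM convention).
    """
    mismatches = []
    for i, (ref_base, read_base, q_char) in enumerate(
        zip(ref_seq, read_seq, qual_string)
    ):
        if ref_base != read_base:
            mismatches.append((i, ref_base, read_base, ord(q_char) - offset))
    return mismatches


# Hypothetical example data, not taken from the thread.
ref = "ACGTACGTAC"
read = "ACGTTCGTAA"
qual = "IIII5IIII?"  # 'I' = Q40, '5' = Q20, '?' = Q30

for pos, ref_base, read_base, q in mismatch_qualities(read, qual, ref):
    print(pos, ref_base, read_base, q, phred_to_error_prob(q))
```

A mismatch at Q20 has a nominal 1% chance of being a base-calling error, while one at Q30 has about 0.1%, so low-quality mismatches are the easier ones to attribute to the sequencer.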

ADD REPLY • link 8.2 years ago by Martombo ★ 3.1k

0

They seem to have Phred scores between 15 and 20, but several of them have good Phred scores (>30).

ADD REPLY • link 8.2 years ago by Nicolas Rosewick 11k

0

What sequencer was used? Do the differences occur in a homopolymer region?

ADD REPLY • link 8.2 years ago by 5heikki 11k

0

We used a MiSeq, and it's not a homopolymer region.

ADD REPLY • link 8.2 years ago by Nicolas Rosewick 11k

0

Which polymerase enzyme was used in PCR?

ADD REPLY • link 8.2 years ago by 5heikki 11k

score 1 · Answer 1 · 2016-03-02

1

8.2 years ago

surendra ▴ 30

Hi,

You can use Picard tools to identify PCR duplicates (with the MarkDuplicates tool):

http://broadinstitute.github.io/picard/command-line-overview.html#Overview

ADD COMMENT • link 8.2 years ago by surendra ▴ 30

0

It looks like Haloplex data - the last thing you want to do is run MarkDuplicates on it. This is a terrible idea.

ADD REPLY • link 8.2 years ago by User 59 13k
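To illustrate why position-based duplicate marking can go wrong on amplicon-style enrichment data, here is a deliberately simplified Python sketch. It groups reads by chromosome, 5' start and strand and flags all but one read per group. This is only an approximation of the general idea: the real MarkDuplicates also considers mate positions and clipping and chooses which read to keep by base quality. The function names and read tuples below are hypothetical.

```python
from collections import defaultdict


def group_by_position(reads):
    """Group read names by (chromosome, 5' start, strand).

    `reads` is an iterable of (name, chrom, start, strand) tuples.
    This is a simplification of position-based duplicate detection.
    """
    groups = defaultdict(list)
    for name, chrom, start, strand in reads:
        groups[(chrom, start, strand)].append(name)
    return groups


def count_flagged_duplicates(groups):
    """Count reads that would be flagged: all but one read per group."""
    return sum(len(names) - 1 for names in groups.values())


# Hypothetical amplicon-like data: most reads share the same start.
reads = [
    ("r1", "chr1", 1000, "+"),
    ("r2", "chr1", 1000, "+"),
    ("r3", "chr1", 1000, "+"),
    ("r4", "chr1", 1000, "+"),
    ("r5", "chr1", 1250, "-"),
]

groups = group_by_position(reads)
print(count_flagged_duplicates(groups), "of", len(reads), "reads flagged")
```

When every fragment from a target starts at the same coordinate by design, a shared start position says nothing about whether two reads came from the same original molecule, so nearly the whole dataset would be flagged.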

score 0 · Answer 2 · 2016-03-02

0

8.2 years ago

Jenez ▴ 540

Those differences could easily have arisen during the sequencing of the fragments: no sequencing machine is flawless, and every run produces some erroneous reads.

ADD COMMENT • link 8.2 years ago by Jenez ▴ 540

score 0 · Answer 3 · 2016-03-02

Looking at this picture, it looks like sequencing error: the mismatches are more or less randomly distributed between reads. However:

In aDNA we observe a lot of fixed errors in some portion of the reads. A possible explanation is that mismatches present at exactly the same position in probable duplicate reads come not from sequencing error but from either polymerase error during PCR or sample contamination. To test whether the polymerase was really responsible, you can do a cycle-by-cycle analysis.
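One possible way to look at this is sketched below in Python. It tallies mismatches both by reference position (to spot the same substitution recurring across reads) and by read cycle, assuming here that "cycle" means sequencing cycle, which is only one reading of the suggestion above. It also assumes ungapped alignments with read sequences stored in reference orientation; the function name and example reads are hypothetical.

```python
from collections import Counter


def tally_mismatches(reads, ref_seq):
    """Count mismatches by (reference position, base) and by read cycle.

    `reads` is a list of (ref_start, sequence, is_reverse) tuples, with
    the sequence given in reference orientation and aligned without gaps.
    """
    by_position = Counter()
    by_cycle = Counter()
    for ref_start, seq, is_reverse in reads:
        for i, base in enumerate(seq):
            ref_pos = ref_start + i
            if base != ref_seq[ref_pos]:
                # For reverse-strand reads the sequencer read the bases
                # in the opposite order, so the cycle index is flipped.
                cycle = len(seq) - 1 - i if is_reverse else i
                by_position[(ref_pos, base)] += 1
                by_cycle[cycle] += 1
    return by_position, by_cycle


# Hypothetical example data, not taken from the thread.
ref = "ACGTACGTACGT"
reads = [
    (0, "ACGAACGT", False),  # T>A at reference position 3
    (2, "GAACGTAC", False),  # the same T>A at position 3, different cycle
    (0, "ACGTACCT", False),  # isolated G>C at reference position 6
]

by_position, by_cycle = tally_mismatches(reads, ref)
print(by_position.most_common())
print(sorted(by_cycle.items()))
```

A substitution that recurs at one reference position with the same alternative base looks more like a polymerase error or a real difference in the template, whereas mismatches scattered over positions, or concentrated in particular cycles regardless of position, point towards the sequencer. Note that when all reads start at the same coordinate, position and cycle coincide and cannot be separated this way.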