Question: Should PCR duplicates always be removed?
17 months ago, jtwalker wrote:

I am working with fairly low-coverage GBS data (average read depth below 11), and I am wondering whether it makes sense to remove PCR duplicates from my data, since they seem to just add extra depth. Does anyone know if there is a general rule that should be followed in this situation?

Thanks!

Edit: I forgot to mention that I have paired-end reads, and I'm not sure whether this changes the answer to my question.

pcr duplicates samtools gbs • 1.0k views
17 months ago, Fabio Marroni (Italy) wrote:

Actually, I think GBS is one of the very few applications in which you can skip PCR duplicate removal, because the space you are sequencing is usually small enough to guarantee that you will find perfect duplicates by chance alone. However, this depends on several properties of the data. For example, with paired-end reads and such low coverage, you should have few duplicates. If you have a lot, then you have a problem: your reads represent PCR artifacts more than the true distribution of the sample. I also suggest that you look at some of the software packages developed for working with GBS data, such as STACKS: https://github.com/enormandeau/stacks_workflow

EDIT: As Eric correctly pointed out, the correct link to STACKS is http://catchenlab.life.illinois.edu/stacks/ (I apologize for the mistake).
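
To give a rough feel for the "duplicates by chance alone" argument, here is a minimal sketch in Python. It assumes reads are drawn uniformly at random from a fixed number of possible fragments, which is a strong simplification of real GBS data. The function name and the example numbers (100,000 loci, a 3 Gb genome) are hypothetical and chosen only for illustration.

```python
import math


def expected_duplicate_fraction(n_reads: int, n_positions: int) -> float:
    """Expected fraction of reads that coincide with another read by chance.

    Assumption (not from any real dataset): every read is sampled uniformly
    and independently from n_positions equally likely fragments.
    """
    if n_reads <= 0 or n_positions <= 0:
        raise ValueError("n_reads and n_positions must be positive")
    if n_positions == 1:
        # Only one possible fragment: every read after the first is a duplicate.
        expected_distinct = 1.0
    else:
        # E[distinct fragments] = M * (1 - (1 - 1/M)^N),
        # written with log1p/expm1 to stay accurate when M is very large.
        expected_distinct = -n_positions * math.expm1(
            n_reads * math.log1p(-1.0 / n_positions)
        )
    return 1.0 - expected_distinct / n_reads


if __name__ == "__main__":
    # Hypothetical GBS-like case: 100,000 possible fragments, 1,000,000 reads.
    print(expected_duplicate_fraction(1_000_000, 100_000))
    # Hypothetical whole-genome-like case: ~3e9 start positions, 10,000,000 reads.
    print(expected_duplicate_fraction(10_000_000, 3_000_000_000))
```

Under these assumptions, a small sequencing space makes most reads look like duplicates even with no PCR bias at all, whereas in a large space chance duplicates are rare. That is why a duplicate rate that would be alarming in whole-genome data can be expected in GBS.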


Hi Fabio. Thanks for referring to my GitHub repository. However, my code is not the source of the STACKS package, just a set of scripts to manage GBS projects and run STACKS itself.

STACKS can be found here: http://catchenlab.life.illinois.edu/stacks/

Comment written 13 months ago by Eric Normandeau