I would like to run a simple SNP analysis.
I have several individuals in which specific markers were targeted, so my reads come from amplicon sequencing. My questions are:
1) Do I have to remove duplicates? From what I understand, tools like Picard identify duplicates as reads that share the same 5' position, but by definition amplicon sequencing reads all start at the same position.
2) If not, how should I handle these data? If an error is propagated during PCR, it will end up as a bad call.
Edit: 3) There are two types of duplicates, optical and PCR. In that case, should I remove only the optical duplicates? If yes, do you know how? It seems that Picard does not distinguish between optical and PCR duplicates.
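
To illustrate the concern in question 1, here is a toy Python sketch of position-based duplicate removal as I understand it. It is not Picard's actual algorithm, only a simplified stand-in: the function name, the read dictionaries and their fields (`chrom`, `five_prime`, `strand`, `qual_sum`) are all made up for the example. Reads are grouped by chromosome, 5' position and strand, and only the best-quality read of each group is kept.

```python
from collections import defaultdict


def dedup_by_five_prime(reads):
    """Toy position-based deduplication (hypothetical helper, not Picard).

    Groups reads by (chrom, 5' position, strand) and keeps only the read
    with the highest summed base quality in each group.
    """
    groups = defaultdict(list)
    for read in reads:
        key = (read["chrom"], read["five_prime"], read["strand"])
        groups[key].append(read)
    return [max(group, key=lambda r: r["qual_sum"]) for group in groups.values()]


# Simulated amplicon data: every read starts at the primer site, so all
# 500 reads share the same chromosome, 5' position and strand.
amplicon_reads = [
    {"name": f"read{i}", "chrom": "chr1", "five_prime": 1000,
     "strand": "+", "qual_sum": 3000 + i}
    for i in range(500)
]

kept = dedup_by_five_prime(amplicon_reads)
print(len(amplicon_reads), "reads in,", len(kept), "kept")
```

With this kind of grouping, nearly all the coverage of an amplicon collapses to a single read, which is why I doubt that standard duplicate removal makes sense here.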
Thanks for your help.