I am trying to detect very low-frequency substitutions in MiSeq ultra-deep (targeted amplicon) sequencing data. The problem is the large amount of noise at high coverage. As you can see in the picture below, there are a large number of (partly randomly) scattered pseudo-substitutions all over my amplicons. I don't have this problem when I look at WES data. I was told that seeing this noise is more or less normal, but how do I distinguish between the noise and real very low-frequency substitutions? Some of the pseudo-substitutions have frequencies near zero and are easy to filter out, but what about those with frequencies close to 1%? Also, to get a better estimate of the real allele frequencies, I need to account for the noise when calculating them. For example, if I find a real substitution with an allele frequency close to 1%, how would I know how much of this 1% is real and how much of it is noise?
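
To make the last question concrete, here is a minimal Python sketch of the kind of calculation I have in mind. It is not an established method, just an illustration under assumptions: `background_rate` is a hypothetical per-position error rate that I would have to estimate somehow (for example from a control sample), the noise is treated as binomial, and the function names are my own. It tests whether the alternate read count exceeds what the background alone would produce, and subtracts the background rate from the observed frequency as a crude correction.

```python
import math


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_term)  # underflows to 0.0 for negligible terms
    return min(total, 1.0)


def assess_site(alt_reads, depth, background_rate):
    """Return (observed AF, noise-subtracted AF, p-value against the background).

    background_rate is an ASSUMED error rate for this position and base change;
    it is not something the sequencer reports directly.
    """
    observed_af = alt_reads / depth
    # Probability of seeing at least alt_reads from noise alone.
    p_value = binom_sf(alt_reads, depth, background_rate)
    # Crude correction: remove the expected noise contribution, floor at zero.
    corrected_af = max(0.0, observed_af - background_rate)
    return observed_af, corrected_af, p_value


if __name__ == "__main__":
    # Hypothetical numbers: 200 alternate reads at 20,000x, assumed 0.2% noise.
    observed, corrected, p = assess_site(200, 20000, 0.002)
    print(observed, corrected, p)
```

With these made-up numbers, the site would be called well above the assumed background, and roughly 0.8% of the 1% would be attributed to the real substitution. What I don't know is whether a single background rate and a plain subtraction like this are reasonable, or whether the noise has to be modelled per position and per substitution type.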