I am working on a resequencing project covering 150 populations. After calling SNPs and indels with GATK and filtering, I found that 20% of the SNPs/indels lie within 20 bp of another variant. I don't know whether this is reasonable. If it is not, what should I do next?
I have also thought about using "vcftools --thin" to thin the SNPs/indels, but that seems too simple and crude.
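
For reference, here is a minimal Python sketch of how a proportion like this could be counted, and of what distance-based thinning roughly does. It is only an illustration: the function names `fraction_clustered` and `greedy_thin` and the toy site list are hypothetical, the input is assumed to be (chromosome, position) pairs already extracted from the VCF, and the greedy rule is a simplified stand-in for thinning, not the actual vcftools code.

```python
from collections import defaultdict


def fraction_clustered(sites, window=20):
    """Fraction of sites with another site less than `window` bp away
    on the same chromosome. `sites` is an iterable of (chrom, pos)."""
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)

    total = 0
    clustered = 0
    for positions in by_chrom.values():
        positions.sort()
        n = len(positions)
        total += n
        for i, pos in enumerate(positions):
            # Only the immediate neighbours need checking once sorted.
            near_prev = i > 0 and pos - positions[i - 1] < window
            near_next = i < n - 1 and positions[i + 1] - pos < window
            if near_prev or near_next:
                clustered += 1
    return clustered / total if total else 0.0


def greedy_thin(sites, min_dist=20):
    """Simplified thinning (an assumption, not the vcftools source):
    walk the sorted sites and keep one only if it is at least
    `min_dist` bp from the last kept site on that chromosome."""
    kept = []
    last_kept = {}
    for chrom, pos in sorted(sites):
        if chrom not in last_kept or pos - last_kept[chrom] >= min_dist:
            kept.append((chrom, pos))
            last_kept[chrom] = pos
    return kept


# Toy data, made up for illustration only.
toy_sites = [("chr1", 100), ("chr1", 110), ("chr1", 200),
             ("chr1", 215), ("chr1", 300), ("chr2", 50)]
print(fraction_clustered(toy_sites))
print(greedy_thin(toy_sites))
```

The sketch also shows why thinning feels crude: it keeps whichever variant comes first in each cluster, regardless of quality or of whether the cluster is a real feature or a calling artifact.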