I'm trying to build a filter for the low-quality SNPs in my data, and I thought of basing it on the ranking values (https://www.biorxiv.org/content/early/2017/10/27/209965).
When I keep only the SNPs with a ranking higher than 0.2, I lose ~99% of my SNPs, which feels too stringent.
I'm using the data from one individual (one read set) to find the SNPs between two haplotypes. Is the ranking value appropriate for filtering SNPs by quality within a single individual, or should I do this with all of my individuals (multiple read sets)?
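
To show what I mean by the loss at a given cutoff, here is a minimal sketch of how the retained fraction could be checked across several thresholds. It assumes the ranking values have already been parsed into a plain list of floats; the function names and the example values are hypothetical and made up for illustration, not taken from my data or from any tool.

```python
def retained_fraction(ranks, cutoff):
    """Return the fraction of SNPs whose ranking is strictly above the cutoff."""
    if not ranks:
        return 0.0
    kept = sum(1 for r in ranks if r > cutoff)
    return kept / len(ranks)


def sweep_cutoffs(ranks, cutoffs):
    """Map each cutoff to the fraction of SNPs it would retain."""
    return {c: retained_fraction(ranks, c) for c in cutoffs}


# Hypothetical ranking values, for illustration only (not real data).
example_ranks = [0.01, 0.02, 0.05, 0.08, 0.12, 0.18, 0.25, 0.40]

for cutoff, frac in sweep_cutoffs(example_ranks, [0.05, 0.1, 0.2]).items():
    print(f"ranking > {cutoff}: {frac:.1%} of SNPs kept")
```

Running the same sweep on the real ranking values would show whether the loss is gradual or whether almost all SNPs sit just below 0.2.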