To filter exome sequencing data and remove false positives, I know that read depth and Phred score are routinely applied.
However, I would like to know whether there are thresholds/cut-offs for the following quality-related annotations, and at which step of the filtering strategy I should apply them:
GC = GC content within 20 bp +/- the variant
FS = Phred-scaled p-value using Fisher's exact test to detect strand bias. If the reference-carrying reads are balanced between forward and reverse strands, then the alternate-carrying reads should be as well.
HRun = Largest Contiguous Homopolymer Run of Variant Allele In Either Direction
HW = Phred-scaled p-value for Hardy-Weinberg violation. Extreme variations on heterozygous calls indicate a false positive call
HaplotypeScore = Consistency of the site with at most two segregating haplotypes (probability that the reads in a window around the variant can be explained by at most two haplotypes)
MQ0Fraction = Fraction of reads with mapping quality zero (the RMS, or Root Mean Square, mapping quality is reported separately as MQ). Regions of excessively low mapping quality are ambiguously mapped and variants called within are suspicious
MQRankSum = Z-score from Wilcoxon rank sum test of Alt vs. Ref read mapping qualities. If the alternate bases are more likely to be found on reads with lower MQ than reference bases then the site is likely mismapped
QD = Variant confidence/quality by depth
ReadPosRankSum = Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias. If the alternate bases are biased towards the beginning or end of the reads then the site is likely a mapping artifact
SB = Strand Bias
BaseQualityRankSumTest = The u-based z-approximation from the Mann-Whitney Rank Sum Test for base qualities (ref bases vs. bases of the alternate allele).
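
To make the FS definition above concrete, here is a minimal Python sketch of how a Phred-scaled strand-bias value could be computed from the four strand counts (reference forward/reverse, alternate forward/reverse). It is only an illustration of the idea as I understand it: the function names are hypothetical, and the caller's actual implementation may normalise the table or differ in other details.

```python
from math import comb, log10


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table.

    Hypothetical helper for illustration; not taken from any variant caller.
    """
    row_ref = ref_fwd + ref_rev          # all reference-carrying reads
    row_alt = alt_fwd + alt_rev          # all alternate-carrying reads
    col_fwd = ref_fwd + alt_fwd          # all forward-strand reads
    total = row_ref + row_alt
    denom = comb(total, col_fwd)

    def prob(k):
        # Hypergeometric probability that k reference reads are forward
        return comb(row_ref, k) * comb(row_alt, col_fwd - k) / denom

    observed = prob(ref_fwd)
    k_min = max(0, col_fwd - row_alt)
    k_max = min(row_ref, col_fwd)
    # Sum the probabilities of all tables at least as extreme as the observed one
    p_value = sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def fisher_strand(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Phred-scale the p-value: higher values mean stronger strand bias."""
    p_value = fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
    return -10.0 * log10(max(p_value, 1e-300))  # floor avoids log10(0)


if __name__ == "__main__":
    # Alternate reads balanced like the reference reads: value near 0
    print(fisher_strand(10, 10, 10, 10))
    # Alternate reads seen on the forward strand only: clearly higher value
    print(fisher_strand(10, 10, 10, 0))
```

With this scaling, a site whose alternate reads are split across strands like the reference reads scores close to 0, while a site whose alternate reads all come from one strand scores noticeably higher, which is why FS is filtered with an upper cut-off rather than a lower one.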