I know that read depth and Phred score are routinely used to filter exome sequencing data and remove false positives.
However, I would like to know whether there is a threshold/cut-off for each of the following quality-related annotations, and at which step of the filtering strategy I should apply them (at the beginning or at the end).
GC = GC content within 20 bp +/- the variant
HRun = Largest Contiguous Homopolymer Run of Variant Allele In Either Direction
HW = Phred-scaled p-value for Hardy-Weinberg violation. Extreme variations on heterozygous calls indicate a false positive call
MQ / MQ0Fraction = RMS (root mean square, also known as quadratic mean) mapping quality, and the fraction of reads with mapping quality zero. Regions of excessively low mapping quality are ambiguously mapped, and variants called within them are suspicious
SB = Strand Bias
BaseQualityRankSumTest = The u-based z-approximation from the Mann-Whitney Rank Sum Test for base qualities (bases supporting the reference allele vs. bases supporting the alternate allele).
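
To make a few of these definitions concrete, here is a minimal Python sketch of how I understand GC, HRun, RMS mapping quality, the fraction of mapping-quality-zero reads, and Phred scaling to be computed. The function names, the 0-based position convention, and the toy reference sequence are my own assumptions for illustration; this is not the variant caller's actual implementation, and it does not suggest any threshold.

```python
import math

def gc_content(ref, pos, flank=20):
    """Fraction of G/C bases in the window ref[pos-flank .. pos+flank] (0-based pos)."""
    start = max(0, pos - flank)
    window = ref[start:pos + flank + 1].upper()
    return sum(base in "GC" for base in window) / len(window)

def homopolymer_run(ref, pos, alt):
    """Longest run of the variant base directly left or right of pos (assumed convention)."""
    seq, alt = ref.upper(), alt.upper()
    left, i = 0, pos - 1
    while i >= 0 and seq[i] == alt:
        left += 1
        i -= 1
    right, i = 0, pos + 1
    while i < len(seq) and seq[i] == alt:
        right += 1
        i += 1
    return max(left, right)

def rms_mapping_quality(mapqs):
    """Root mean square (quadratic mean) of the reads' mapping qualities."""
    return math.sqrt(sum(q * q for q in mapqs) / len(mapqs))

def fraction_mapq_zero(mapqs):
    """Fraction of reads whose mapping quality is exactly zero."""
    return sum(q == 0 for q in mapqs) / len(mapqs)

def phred_scale(p_value):
    """Phred-scaled value of a probability or p-value: -10 * log10(p)."""
    return -10 * math.log10(p_value)

# Toy example: a G>T variant at 0-based position 8 of a made-up reference.
toy_ref = "ACGTAAAAGTTTTTTC"
toy_mapqs = [60, 60, 0]
print(gc_content(toy_ref, 8, flank=2))
print(homopolymer_run(toy_ref, 8, "T"))
print(rms_mapping_quality(toy_mapqs), fraction_mapq_zero(toy_mapqs))
print(phred_scale(0.001))
```

Each function returns a plain number, so any cut-off could be applied afterwards as a simple comparison at whichever filtering step turns out to be appropriate.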