I am trying to put together a set of heuristics to predict which exome sequencing variants (Illumina with Agilent or TruSeq capture) will or will not validate by Sanger sequencing. This is a recurring problem in our lab: many variants called by samtools plus another tool fail Sanger validation, even though the alignment and base qualities look fine when I inspect the BAM files with samtools tview. These are mostly heterozygous calls, and I noticed that in most cases the variant allele accounts for 20-30% of the reads, with a minimum depth of 10 and coverage of around 30x.
What should I look for when deciding which calls are less reliable? Essentially, can I do a better job of assigning a SNP score to variants by manually scrutinizing the BAM file, and if so, what is the protocol for doing that?
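To make the allele-fraction observation concrete, here is a minimal sketch of one check I have in mind. It is an assumption on my part, not something our pipeline does: it treats a true heterozygote as a 50/50 binomial draw and asks how surprising the observed number of variant reads is. The function names `allele_fraction` and `het_lower_tail` are made up for illustration, and the read counts are hypothetical examples, not our data.

```python
from math import comb


def allele_fraction(alt_reads, depth):
    """Fraction of reads at a site that support the variant allele."""
    return alt_reads / depth


def het_lower_tail(alt_reads, depth, expected=0.5):
    """P(X <= alt_reads) for X ~ Binomial(depth, expected).

    Assumes a true heterozygote contributes the variant allele to each
    read with probability `expected` (0.5 is an idealization; capture
    and mapping bias can shift it in real data).
    """
    return sum(
        comb(depth, k) * expected**k * (1 - expected) ** (depth - k)
        for k in range(alt_reads + 1)
    )


# Hypothetical (variant reads, depth) pairs in the range described above.
for alt, depth in [(3, 10), (5, 10), (7, 30), (15, 30)]:
    print(
        f"alt={alt} depth={depth} "
        f"fraction={allele_fraction(alt, depth):.2f} "
        f"lower_tail_p={het_lower_tail(alt, depth):.4f}"
    )
```

Under this simplified model, 3 variant reads out of 10 is not unusual for a real heterozygote (lower-tail probability about 0.17), whereas 7 out of 30 is much harder to explain by sampling alone (about 0.003). So the same 20-30% fraction may mean different things at different depths, and I am not sure whether this is a sensible basis for a score.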