I am analyzing variant calls from a pair of identical twins. (Both are confirmed to carry the same disease-causing mutation for a Mendelian disease.) However, their phenotypes are very different, so I would like to identify the variants that are unique to each twin.
I got the VCF file from the MiSeq built-in tools and used SnpEff and SnpSift to annotate and identify the variants.
In the end I got about 80 unique variants for each twin. (The only manual filter I applied was a read depth of 20.)
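A minimal sketch of this kind of per-twin comparison, in case it helps to make the question concrete. It is not the exact pipeline I ran: it assumes each twin has its own VCF, that total read depth is stored as `DP` in the INFO column, and that a variant is identified by chromosome, position, REF and ALT. The function names and the example records are made up for illustration.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=35;AF=0.4' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def load_variants(lines, min_depth=20):
    """Return {(chrom, pos, ref, alt): FILTER value} for records with DP >= min_depth."""
    variants = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, _vid, ref, alt, _qual, flt, info = cols[:8]
        # Assumption: total read depth is an integer stored as INFO/DP.
        depth = parse_info(info).get("DP", "")
        if not depth.isdigit() or int(depth) < min_depth:
            continue
        variants[(chrom, int(pos), ref, alt)] = flt
    return variants


def unique_to_each(twin_a, twin_b):
    """Split two variant dicts into the entries found in only one of them."""
    only_a = {k: v for k, v in twin_a.items() if k not in twin_b}
    only_b = {k: v for k, v in twin_b.items() if k not in twin_a}
    return only_a, only_b


# Made-up records for illustration only (not real data).
twin_a_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=35",
    "chr1\t200\trs123\tC\tT\t40\tSB\tDP=60",
    "chr2\t300\t.\tG\tA\t30\tPASS\tDP=10",
]
twin_b_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=40",
    "chr3\t400\t.\tT\tC\t45\tLowVariantFreq;SB\tDP=25",
]

only_a, only_b = unique_to_each(
    load_variants(twin_a_lines), load_variants(twin_b_lines)
)
print(only_a)
print(only_b)
```

One property of this sketch worth noting: because the depth cutoff is applied before the comparison, a site that is poorly covered in one twin will look unique to the other twin even if both actually carry the variant.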
Interestingly, many of the identified variants are annotated as "LowVariantFreq/SB" or "SB".
So I am wondering whether I need to remove all variants carrying these filter annotations, or whether I can keep them and treat them with caution. If I remove every variant annotated with "LowVariantFreq/SB" or "SB", only a very small fraction of the variants remains, and I am a little worried that I might miss true variants.
(Note that I used the default settings for these filter annotations.)
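To show what is at stake, here is a rough sketch of how the flagged variants could be counted and set aside instead of being deleted outright. It assumes the FILTER values are available as plain strings and that multiple flags are separated by `;` (the usual VCF convention) or `/` (as in the labels quoted above). The helper names and the example list are hypothetical.

```python
from collections import Counter


def split_flags(filter_value):
    """Split a FILTER value such as 'LowVariantFreq;SB' into individual flags."""
    # Assumption: flags are separated by ';' or by '/'.
    return [flag for flag in filter_value.replace("/", ";").split(";") if flag]


def tally_flags(filter_values):
    """Count how often each individual flag occurs."""
    counts = Counter()
    for value in filter_values:
        counts.update(split_flags(value))
    return counts


def partition(filter_values):
    """Separate PASS entries from flagged ones without discarding either group."""
    passed = [v for v in filter_values if v == "PASS"]
    flagged = [v for v in filter_values if v != "PASS"]
    return passed, flagged


# Made-up FILTER values for illustration only (not real data).
example_filters = ["PASS", "SB", "LowVariantFreq/SB", "SB", "PASS", "LowVariantFreq;SB"]

counts = tally_flags(example_filters)
passed, flagged = partition(example_filters)
print(counts)
print(len(passed), "passed;", len(flagged), "flagged")
```

Keeping the flagged group as a separate list means those calls can still be inspected by hand or validated later, rather than being lost at the filtering step.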
In addition, some of the called variants map to known SNPs (they have an rs ID). I have heard that we cannot simply throw those variants away: resources such as the 1000 Genomes Project SNP set have a lot of bias, so we cannot assume that a variant with an rs ID is really a SNP.
Could someone please comment?