I am trying to get a subset of SNPs common to three of six individuals from a multiVCF file. My problem is that I get different results when I do it
- manually (extracting the genotypes with "SnpSift extractFields" and filtering the variants in Excel)
- with "SnpSift filter":
SnpSift varType snps_results_dir/my_multiVCF_SNPs.vcf | SnpSift filter "isVariant( GEN) & isVariant( GEN) & isVariant( GEN) & isRef( GEN) & isRef( GEN) & isRef( GEN) & isRef( GEN) & isRef( GEN)" > my_SNP_subset.vcf
With the first method I get 479 SNPs, whereas "SnpSift filter" (the second method) gives me only about 250 SNPs.
So I'm confused. Which method/result is correct? Could somebody help me with this discrepancy? Is there any other standard procedure for filtering variants from a multiVCF file?
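For reference, an independent cross-check of the two counts could look something like the Python sketch below. It is only an illustration under stated assumptions: the function names `classify` and `count_subset` are made up, the sample order (first three individuals variant, last three reference) is assumed, and the demo records are invented toy data, not my real file. It reports two counts because I am not sure how either method treats missing genotypes (`./.`), which may or may not be related to the difference.

```python
def classify(gt):
    """Classify a VCF GT string (e.g. 0/0, 0|1, ./.) as 'ref', 'variant' or 'missing'."""
    alleles = gt.replace("|", "/").split("/")
    if any(a in (".", "") for a in alleles):
        return "missing"
    if all(a == "0" for a in alleles):
        return "ref"
    return "variant"


def is_snp(ref, alt):
    """True if REF and every ALT allele is a single A/C/G/T base."""
    bases = {"A", "C", "G", "T"}
    return ref in bases and all(a in bases for a in alt.split(","))


def count_subset(lines, variant_idx, ref_idx):
    """Count SNP records that are variant in variant_idx samples and reference in ref_idx samples.

    Returns (strict, lenient):
      strict  - every sample in ref_idx must be called homozygous reference
      lenient - samples in ref_idx may be homozygous reference or missing
    """
    strict = lenient = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if not is_snp(fields[3], fields[4]):
            continue  # keep SNPs only
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_pos = fmt.index("GT")
        calls = [classify(s.split(":")[gt_pos]) for s in fields[9:]]
        if not all(calls[i] == "variant" for i in variant_idx):
            continue
        if all(calls[i] == "ref" for i in ref_idx):
            strict += 1
        if all(calls[i] in ("ref", "missing") for i in ref_idx):
            lenient += 1
    return strict, lenient


# Invented toy records with six samples (assumed order: 3 carriers, then 3 non-carriers).
DEMO_VCF = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4\tS5\tS6",
    "chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t1/1\t0/1\t0/0\t0/0\t0/0",
    "chr1\t200\t.\tC\tT\t.\t.\t.\tGT\t0/1\t0/1\t1/1\t0/0\t./.\t0/0",
    "chr1\t300\t.\tG\tA\t.\t.\t.\tGT\t0/1\t0/0\t0/1\t0/0\t0/0\t0/0",
    "chr1\t400\t.\tAT\tA\t.\t.\t.\tGT\t0/1\t0/1\t0/1\t0/0\t0/0\t0/0",
]

strict, lenient = count_subset(DEMO_VCF, variant_idx=[0, 1, 2], ref_idx=[3, 4, 5])
print("strict:", strict, "lenient:", lenient)
```

On a real file the lines could be read with `open("my_multiVCF_SNPs.vcf")` instead of the toy list. If the strict and lenient counts differ, that would at least show whether missing genotypes matter for this data set.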
Thank you very much in advance