##fileformat=VCFv4.0
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FILTER=<ID=q10,Description="Quality below 10">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT A B
1 15 . AGTAGTCATACATCAT A 1806 q10 DP=35 GT:GQ:DP 1/1:409:35 1/1:409:35
1 19 . G T 1792 PASS DP=32 GT:GQ:DP 0/0:245:32 0/0:245:32
1 25 . C G 628 q10 DP=21 GT:GQ:DP 0/1:245:32 0/1:245:32
Run vcftools:
vcftools --vcf subset.vcf --remove-indels --out SNPs_only --recode
cat SNPs_only.recode.vcf
##fileformat=VCFv4.0
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FILTER=<ID=q10,Description="Quality below 10">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT A B
1 19 . G T 1792 PASS . GT:GQ:DP 0/0:99:32 0/0:99:32
1 25 . C G 628 q10 . GT:GQ:DP 0/1:99:32 0/1:99:32
Now try with an individual exhibiting both alleles. Can vcftools decompose these before the filter?
##fileformat=VCFv4.0
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FILTER=<ID=q10,Description="Quality below 10">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT A B
1 15 . A G,AG 1806 q10 DP=35 GT:GQ:DP 0/1:409:35 1/1:409:35
1 25 . C G 628 q10 DP=21 GT:GQ:DP 0/1:245:32 0/1:245:32
Nope. The whole record at position 15 is removed, SNP allele included:
##fileformat=VCFv4.0
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FILTER=<ID=q10,Description="Quality below 10">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT A B
1 25 . C G 628 q10 . GT:GQ:DP 0/1:99:32 0/1:99:32
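If the goal is to keep the G allele at position 15, one option is to split multiallelic records into biallelic ones first and only then drop the indel alleles. Below is a minimal Python sketch of that idea, not what vcftools does internally. The helper names (`is_indel`, `split_multiallelic`, `remove_indels_after_split`) are hypothetical, "indel" is assumed to mean any ALT whose length differs from REF, and setting the other ALT's genotype calls to missing is my assumption about how the split should behave.

```python
def is_indel(ref, alt):
    """Assumption: an allele is an indel if it changes the length of REF."""
    return len(ref) != len(alt)


def split_multiallelic(line):
    """Yield one biallelic record (tab-joined) per ALT allele of a VCF data line."""
    fields = line.split()
    alts = fields[4].split(",")
    gt_idx = fields[8].split(":").index("GT")
    for k, alt in enumerate(alts, start=1):
        out = fields[:9]          # copy of the fixed columns plus FORMAT
        out[4] = alt
        for sample in fields[9:]:
            parts = sample.split(":")
            sep = "|" if "|" in parts[gt_idx] else "/"
            calls = parts[gt_idx].split(sep)
            # Keep REF as 0, renumber the current ALT to 1,
            # and mark calls for any other ALT as missing (an assumption).
            parts[gt_idx] = sep.join(
                "0" if c == "0" else "1" if c == str(k) else "." for c in calls
            )
            out.append(":".join(parts))
        yield "\t".join(out)


def remove_indels_after_split(lines):
    """Split every data line, then keep only the non-indel alleles."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)     # header lines pass through untouched
            continue
        for rec in split_multiallelic(line):
            cols = rec.split("\t")
            if not is_indel(cols[3], cols[4]):
                kept.append(rec)
    return kept


records = [
    "1 15 . A G,AG 1806 q10 DP=35 GT:GQ:DP 0/1:409:35 1/1:409:35",
    "1 25 . C G 628 q10 DP=21 GT:GQ:DP 0/1:245:32 0/1:245:32",
]
for rec in remove_indels_after_split(records):
    print(rec)
```

With the two records above, this keeps the A to G record at position 15 alongside the SNP at position 25. The sketch ignores symbolic alleles such as `<DEL>` and does not recompute per-allele INFO or FORMAT fields, so treat it as an illustration of the order of operations only.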
I think you would really have to define your expected behavior explicitly.
What does it mean for an insertion and a substitution to "overlap"?
What if the substitution overlaps with the indel in the same sample, one allele for each?
Say that you have an indel starting at position 15 (REF AGTAGTCATACATCAT).
At position 19 you have a SNP, G to T,
and at position 25 you have a SNP, C to G.
Are those SNPs retained?
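One way to pin down "overlap" is to compare the reference intervals that each record covers. The sketch below is only an illustration of that definition. The helper names `ref_span` and `overlaps` are hypothetical, and it assumes 1-based inclusive coordinates with the span taken from the length of REF.

```python
def ref_span(pos, ref):
    """Return the 1-based inclusive reference interval covered by a record."""
    return pos, pos + len(ref) - 1


def overlaps(a, b):
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


indel_ref = "AGTAGTCATACATCAT"
indel = ref_span(15, indel_ref)   # 16 bases of REF, so positions 15 to 30
snp19 = ref_span(19, "G")
snp25 = ref_span(25, "C")

print(indel)                      # (15, 30)
print(overlaps(indel, snp19))     # True
print(overlaps(indel, snp25))     # True

# Sanity check: the SNP REF bases agree with the indel's REF string.
print(indel_ref[19 - 15], indel_ref[25 - 15])   # G C
```

Under this definition both SNPs sit inside the deleted region, yet in the recoded output posted above both were retained. Whether an overlapping SNP should be kept or dropped is still a policy choice that the definition alone does not settle.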