Hello everyone,
I'm studying the parameters of bcftools view and trying to understand them better, so I ran two tests, one with bcftools view -c only and the other with bcftools view -e only, and compared the outputs with bcftools stats (results below).
I've read in the manual that the -c option forces the -e option. My question is: if one option forces the other, shouldn't the number of SNPs in common be higher? I understand that many SNPs are called only with the -e option, but shouldn't all the SNPs called with the -c option be present in the -e output as well?
Does anyone know where I could find good documentation on how the -c and -e options work?
Thanks in advance!
# This file was produced by bcftools stats (1.2+htslib-1.2.1) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats S37.um2Q20.c.vcf.bgzip S37.um2Q20.e.vcf.bgzip
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 S37.um2Q20.c.vcf.bgzip
ID 1 S37.um2Q20.e.vcf.bgzip
ID 2 S37.um2Q20.c.vcf.bgzip S37.um2Q20.e.vcf.bgzip
# SN, Summary numbers:
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 1 number of samples: 1
SN 0 number of records: 258033024
SN 0 number of SNPs: 691841
SN 0 number of MNPs: 0
SN 0 number of indels: 3020
SN 0 number of others: 0
SN 0 number of multiallelic sites: 4872
SN 0 number of multiallelic SNP sites: 4694
SN 1 number of records: 212276037
SN 1 number of SNPs: 1454264
SN 1 number of MNPs: 0
SN 1 number of indels: 20454
SN 1 number of others: 0
SN 1 number of multiallelic sites: 1460927
SN 1 number of multiallelic SNP sites: 1454264
SN 2 number of records: 12043
SN 2 number of SNPs: 115
SN 2 number of MNPs: 0
SN 2 number of indels: 11928
SN 2 number of others: 0
SN 2 number of multiallelic sites: 677
SN 2 number of multiallelic SNP sites: 115
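
To put a number on the overlap, here is a minimal Python sketch that reads the SN lines above and computes what fraction of each file's SNPs is shared. It assumes, based on the "Definition of sets" header, that set 0 and set 1 count records private to the -c and -e files respectively and that set 2 counts records present in both. The names `parse_summary` and `shared_fraction` are made up for illustration, and only the three SNP lines are embedded as input.

```python
import re

# The three "number of SNPs" lines copied from the bcftools stats output above.
STATS_TEXT = """\
SN 0 number of SNPs: 691841
SN 1 number of SNPs: 1454264
SN 2 number of SNPs: 115
"""

# Matches: SN <set id> <key>: <integer value>
SN_LINE = re.compile(r"^SN\s+(\d+)\s+(.+?):\s+(\d+)\s*$")


def parse_summary(text):
    """Return {(set_id, key): value} for every SN line in the text."""
    summary = {}
    for line in text.splitlines():
        match = SN_LINE.match(line)
        if match:
            set_id, key, value = match.groups()
            summary[(int(set_id), key)] = int(value)
    return summary


def shared_fraction(summary, key="number of SNPs"):
    """Fraction of each file's SNPs that also appears in the other file.

    Assumption: set 0 = private to the -c file, set 1 = private to the
    -e file, set 2 = shared by both files.
    """
    only_c = summary[(0, key)]
    only_e = summary[(1, key)]
    shared = summary[(2, key)]
    return shared / (only_c + shared), shared / (only_e + shared)


summary = parse_summary(STATS_TEXT)
frac_c, frac_e = shared_fraction(summary)
print(f"-c SNPs also found with -e: {frac_c:.4%}")
print(f"-e SNPs also found with -c: {frac_e:.4%}")
```

Under that reading of the sets, only 115 of the 691956 SNPs in the -c file are also in the -e file, which is the discrepancy I'm asking about.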