I have been using bcftools stats, but I’m uncertain what several fields in the output mean. The documentation explains the command-line options well, but it does not break down what the output means or how it is calculated.
This is part of the output from bcftools stats on my file:
# SN, Summary numbers:
# SN id key value
SN 0 number of samples: 4301
SN 0 number of records: 803
SN 0 number of SNPs: 714
SN 0 number of MNPs: 0
SN 0 number of indels: 94
SN 0 number of others: 7
SN 0 number of multiallelic sites: 33
SN 0 number of multiallelic SNP sites: 2
Things I can't find in the documentation:
- Does “multiallelic” denote “more than 2 alleles” rather than “not monomorphic”?
- The number of SNPs + indels + others does not sum to the total number of records (714 + 94 + 7 = 815, versus 803 records). Is this because an SNP can also be an indel or “other”?
- What types of variants are covered by “others” here?
- Why is the id field blank for all sections of my output?
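
To make the mismatch in the second question concrete, here is a small Python sketch that parses the SN lines exactly as pasted above and compares the per-type counts with the record count. The helper name parse_sn and the whitespace-based parsing are my own assumptions for illustration; they are not part of bcftools, and the real output file may be delimited differently from what is shown here.

```python
import re

# SN lines copied from the output above (as pasted, space-separated).
SN_TEXT = """\
SN 0 number of samples: 4301
SN 0 number of records: 803
SN 0 number of SNPs: 714
SN 0 number of MNPs: 0
SN 0 number of indels: 94
SN 0 number of others: 7
SN 0 number of multiallelic sites: 33
SN 0 number of multiallelic SNP sites: 2
"""

def parse_sn(text):
    """Hypothetical helper: map each 'SN <id> <key>: <value>' line to {key: int}."""
    counts = {}
    for line in text.splitlines():
        # Comment lines starting with '#' do not match and are skipped.
        m = re.match(r"SN\s+\d+\s+(.+?):\s+(\d+)\s*$", line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts

counts = parse_sn(SN_TEXT)

# Add up the per-type counts and compare with the number of records.
type_sum = sum(counts[f"number of {kind}"]
               for kind in ("SNPs", "MNPs", "indels", "others"))
records = counts["number of records"]
excess = type_sum - records

print(f"sum of types = {type_sum}, records = {records}, excess = {excess}")
```

The per-type counts add up to 815, which is 12 more than the 803 records, so at least some records must be counted under more than one type.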