VCF Statistics
12 days ago
vlip • 0

Hi. I have a VCF file from my GBS run and I would like to get its summary statistics. Specifically, I want the number of variants, the overall proportion of missing genotypes (%), the proportion of heterozygous genotypes (%), and the average depth of coverage. Is there a way to get these?

Thank you very much!

VCF statistics variants stats
$ bcftools stats

Thanks! I tried bcftools, but the overall proportions of missing and heterozygous genotypes are not included in the output.

but the overall proportions of missing and heterozygous genotypes are not included in the output.

?

 bcftools stats --samples - ~/src/jvarkit-git/src/test/resources/rotavirus_rf.vcf.gz | grep -i miss -A5
# PSC   [2]id   [3]sample   [4]nRefHom  [5]nNonRefHom   [6]nHets    [7]nTransitions [8]nTransversions   [9]nIndels  [10]average depth   [11]nSingletons [12]nHapRef [13]nHapAlt [14]nMissing
PSC 0   S1  36  2   5   3   4   2   0.0 9   0   0   0
PSC 0   S2  30  8   7   1   14  0   0.0 0   0   0   0
PSC 0   S3  30  8   7   1   14  0   0.0 0   0   0   0
PSC 0   S4  31  6   5   4   7   3   0.0 13  0   0   0
PSC 0   S5  37  8   0   2   6   0   0.0 8   0   0   0
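
The PSC lines give counts rather than percentages, so the proportions still have to be derived. A minimal Python sketch of that step follows. The function name `psc_proportions` is made up for illustration, and the denominators are my assumption: missing is taken over nRefHom + nNonRefHom + nHets + nMissing, and heterozygous over called genotypes only. As far as I know, recent bcftools versions count only SNPs in columns 4 to 6, so check the comment lines in your own output before trusting the numbers.

```python
def psc_proportions(stats_text):
    """Return {sample: (fraction_missing, fraction_het)} from `bcftools stats -s -` text.

    Column positions follow the "# PSC" header shown above:
    [3]sample [4]nRefHom [5]nNonRefHom [6]nHets ... [14]nMissing
    """
    result = {}
    for line in stats_text.splitlines():
        if not line.startswith("PSC"):
            continue  # skips the "# PSC" header and every other section
        fields = line.split()  # assumes sample names contain no whitespace
        sample = fields[2]
        hom_ref, hom_alt, het = int(fields[3]), int(fields[4]), int(fields[5])
        missing = int(fields[13])
        called = hom_ref + hom_alt + het
        total = called + missing  # assumed denominator, see note above
        frac_missing = missing / total if total else None
        frac_het = het / called if called else None
        result[sample] = (frac_missing, frac_het)
    return result


# Example with the S4 row from the output above
example = "PSC 0 S4 31 6 5 4 7 3 0.0 13 0 0 0"
for name, (miss, het) in psc_proportions(example).items():
    print(name, miss, het)
```

To use it on a real file, save the output of `bcftools stats --samples - your.vcf.gz` to a text file and pass its contents to the function. An overall figure can be obtained by summing the counts across samples before dividing.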

10 days ago
Medhat 9.3k

Try rtg vcfstats (part of RTG Tools). Alternatively, you can use SnpSift, but you will need to filter the output and build the statistics that suit your needs.
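
If none of the tools reports exactly the four numbers asked for, another option is to compute them straight from the genotype columns. The sketch below is only that, a sketch: `vcf_summary` is a hypothetical helper, it assumes an uncompressed, tab-separated VCF with a per-sample `GT` field and (optionally) a per-sample `DP` field, it treats any genotype containing `.` as missing, and it defines the heterozygous proportion over called genotypes only.

```python
import re


def vcf_summary(lines):
    """Summarize a VCF given as an iterable of text lines."""
    n_variants = n_genotypes = n_missing = n_het = 0
    dp_sum = dp_count = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        cols = line.rstrip("\n").split("\t")
        n_variants += 1
        if len(cols) < 10:
            continue  # sites-only record, no genotypes to count
        keys = cols[8].split(":")
        gt_i = keys.index("GT") if "GT" in keys else None
        dp_i = keys.index("DP") if "DP" in keys else None
        for sample in cols[9:]:
            values = sample.split(":")
            n_genotypes += 1
            if gt_i is not None and gt_i < len(values):
                alleles = re.split(r"[/|]", values[gt_i])
            else:
                alleles = ["."]
            if "." in alleles:
                n_missing += 1  # partially missing calls such as ./1 count as missing
            elif len(set(alleles)) > 1:
                n_het += 1
            if dp_i is not None and dp_i < len(values) and values[dp_i].isdigit():
                dp_sum += int(values[dp_i])
                dp_count += 1
    called = n_genotypes - n_missing
    return {
        "n_variants": n_variants,
        "frac_missing": n_missing / n_genotypes if n_genotypes else None,
        "frac_het": n_het / called if called else None,
        "mean_depth": dp_sum / dp_count if dp_count else None,
    }
```

For a plain-text file, `vcf_summary(open("your.vcf"))` should work; for a bgzipped file, Python's `gzip.open("your.vcf.gz", "rt")` can be passed instead. The mean depth here averages per-sample `DP` over genotypes that report it, which may differ from the site-level `INFO/DP` that some tools summarize.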