Does anyone know of a tool that can create a population statistics VCF file for the Cartagenia variant software?
From the help system, the format is described as follows:
Population Statistics VCF File Format

A population statistics file is assumed to already contain the count statistics and should be a VCF file with the following constraints:
- it contains no samples or FORMAT column
- it contains following INFO fields. These are properly declared in the VCF meta records so the application can validate the file before parsing.

AC: Allele count in genotypes, for each ALT allele, in the same order as listed
AN: Total number of alleles in called genotypes
GTC: GenoType Counts. For each ALT allele in the same order as listed = 0/0,0/1,1/1,0/2,1/2,2/2,0/3,1/3,2/3,3/3,etc. Phasing is ignored; hence 1/0, 0|1 and 1|0 are all counted as 0/1. When one or more alleles are not called for a genotype in a specific sample (./., ./0, ./1, ./2, etc.), that sample's genotype is completely discarded for calculating GTC.
An example population statistics VCF file:
##fileformat=VCFv4.1
##contig=<ID=1,assembly=b37,length=249250621>
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=GTC,Number=G,Type=Integer,Description="GenoType Counts. For each ALT allele in the same order as listed = 0/0,0/1,1/1,0/2,1/2,2/2,0/3,1/3,2/3,3/3,etc. Phasing is ignored; hence 1/0, 0|1 and 1|0 are all counted as 0/1. When one or more alleles is not called for a genotype in a specific sample (./., ./0, ./1, ./2, etc.), that sample's genotype is completely discarded for calculating GTC.">
#CHROM POS ID REF ALT QUAL FILTER INFO
1 6984272 rs7414812 C G 173980.29 PASS AC=570;AN=996;GTC=96,234,168
1 6985908 rs2412220 G A 22348.11 PASS AC=69;AN=996;GTC=431,65,2
1 6985915 . G A 252.49 PASS AC=1;AN=996;GTC=497,1,0
1 6985988 . T C 278.69 PASS AC=3;AN=996;GTC=495,3,0
1 6986109 rs4908574 G A 214803.92 PASS AC=659;AN=996;GTC=59,219,220
Such a file is presumably generated from multiple per-sample VCF files or from one large merged VCF file with sample columns. I could write a script for this, but if anyone knows of an existing tool that does it, please let me know.
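For reference, here is a rough sketch of what such a script might look like, assuming a single merged, tab-delimited, multi-sample VCF with a GT entry in the FORMAT column. The function names (genotype_index, population_stats, convert_line, convert) are made up for illustration and are not part of any Cartagenia tooling. Two further assumptions: alleles of partially called genotypes (e.g. ./1) still count towards AC and AN, which the help text does not specify, and the GTC Description is shortened. The GTC ordering follows the quoted help text, and only fully called diploid genotypes are counted there.

```python
import re

# INFO declarations modelled on the example header above (GTC description shortened).
INFO_HEADER = [
    '##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, '
    'for each ALT allele, in the same order as listed">',
    '##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in '
    'called genotypes">',
    '##INFO=<ID=GTC,Number=G,Type=Integer,Description="GenoType Counts">',
]


def genotype_index(a, b):
    """Position of unphased genotype a/b in the order 0/0,0/1,1/1,0/2,1/2,2/2,..."""
    lo, hi = sorted((a, b))
    return hi * (hi + 1) // 2 + lo


def population_stats(alt_field, genotypes):
    """Return (AC list, AN, GTC list) for one site from a list of GT strings."""
    n_alt = len(alt_field.split(","))
    n_alleles = n_alt + 1
    ac = [0] * n_alt
    an = 0
    gtc = [0] * (n_alleles * (n_alleles + 1) // 2)
    for gt in genotypes:
        alleles = re.split(r"[/|]", gt)  # phasing is ignored
        called = [int(x) for x in alleles if x != "."]
        # Assumption: called alleles of partial genotypes still count for AC/AN.
        an += len(called)
        for x in called:
            if x > 0:
                ac[x - 1] += 1
        # GTC only counts fully called diploid genotypes.
        if len(alleles) == 2 and len(called) == 2:
            gtc[genotype_index(called[0], called[1])] += 1
    return ac, an, gtc


def convert_line(line):
    """Turn one multi-sample VCF data line into an 8-column statistics line."""
    fields = line.rstrip("\n").split("\t")
    gt_pos = fields[8].split(":").index("GT")
    genotypes = [sample.split(":")[gt_pos] for sample in fields[9:]]
    ac, an, gtc = population_stats(fields[4], genotypes)
    info = "AC=%s;AN=%d;GTC=%s" % (
        ",".join(map(str, ac)), an, ",".join(map(str, gtc)))
    return "\t".join(fields[:7] + [info])


def convert(lines):
    """Yield the lines of a population statistics VCF from a multi-sample VCF."""
    yield "##fileformat=VCFv4.1"
    for line in lines:
        if line.startswith("##contig"):
            yield line.rstrip("\n")
        elif line.startswith("##"):
            continue  # other meta records are dropped
        elif line.startswith("#CHROM"):
            for header in INFO_HEADER:
                yield header
            yield "\t".join(line.rstrip("\n").split("\t")[:8])
        elif line.strip():
            yield convert_line(line)
```

Something like `for out in convert(open("merged.vcf")): print(out)` would then write the result (merged.vcf being a placeholder name). The counts are at least consistent with the example file: for the first record, 234 + 2*168 = 570 = AC and 2*(96+234+168) = 996 = AN. Separate per-sample VCF files would need to be merged first, and I have not checked the output against Cartagenia itself.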
Thanks