Question: Population statistics VCF file format for Cartagenia?
hhj0 wrote (3.0 years ago):

Does anyone know of a tool that can create a population statistics VCF file for the Cartagenia variant software?

In the help system, the format is described as follows:

Population Statistics VCF File Format

A population statistics file is assumed to already contain the count statistics and should be a VCF file with the following constraints:

- it contains no samples or FORMAT column
- it contains following INFO fields. These are properly declared in the VCF meta records so the application can validate the file before parsing.

AC: Allele count in genotypes, for each ALT allele, in the same order as listed

AN: Total number of alleles in called genotypes

GTC: GenoType Counts. For each ALT allele in the same order as listed = 0/0,0/1,1/1,0/2,1/2,2/2,0/3,1/3,2/3,3/3,etc. Phasing is ignored; hence 1/0, 0|1 and 1|0 are all counted as 0/1. When one or more alleles are not called for a genotype in a specific sample (./., ./0, ./1, ./2, etc.), that sample's genotype is completely discarded for calculating GTC.
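
To make the GTC ordering concrete, here is a minimal Python sketch of how I read the rules above. The function names (`normalize_genotype`, `gtc_index`, `genotype_order`) are my own and not part of Cartagenia or any library, and the sketch assumes diploid genotypes only. For an unphased pair j/k with j <= k, the listed order 0/0,0/1,1/1,0/2,... corresponds to position k*(k+1)/2 + j.

```python
import re

def normalize_genotype(gt):
    """Return a sorted (j, k) allele pair, or None if the genotype must be discarded.

    Phasing is ignored, so "1/0", "0|1" and "1|0" all become (0, 1).
    Any uncalled allele ("./.", "./1", ...) discards the whole genotype.
    Assumption: only diploid genotypes are counted.
    """
    alleles = re.split(r"[/|]", gt)
    if len(alleles) != 2 or "." in alleles:
        return None
    j, k = sorted(int(a) for a in alleles)
    return j, k

def gtc_index(j, k):
    """Position of genotype j/k (j <= k) in the GTC list."""
    return k * (k + 1) // 2 + j

def genotype_order(n_alt):
    """List the genotypes in GTC order for a site with n_alt ALT alleles."""
    return [f"{j}/{k}" for k in range(n_alt + 1) for j in range(k + 1)]
```

For a site with two ALT alleles, `genotype_order(2)` gives the six genotypes in the order the help text lists them, so GTC would carry six counts.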

An example population statistics VCF file:

##fileformat=VCFv4.1
##contig=<ID=1,assembly=b37,length=249250621>
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=GTC,Number=G,Type=Integer,Description="GenoType Counts. For each ALT allele in the same order as listed = 0/0,0/1,1/1,0/2,1/2,2/2,0/3,1/3,2/3,3/3,etc. Phasing is ignored; hence 1/0, 0|1 and 1|0 are all counted as 0/1. When one or more alleles is not called for a genotype in a specific sample (./., ./0, ./1, ./2, etc.), that sample's genotype is completely discarded for calculating GTC.">
#CHROM POS ID REF ALT QUAL FILTER INFO
1 6984272 rs7414812 C G 173980.29 PASS AC=570;AN=996;GTC=96,234,168 
1 6985908 rs2412220 G A 22348.11 PASS AC=69;AN=996;GTC=431,65,2 
1 6985915 . G A 252.49 PASS AC=1;AN=996;GTC=497,1,0 
1 6985988 . T C 278.69 PASS AC=3;AN=996;GTC=495,3,0 
1 6986109 rs4908574 G A 214803.92 PASS AC=659;AN=996;GTC=59,219,220
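
As a sanity check on my reading of the format, the three fields in these records look mutually consistent: for the first record, GTC=96,234,168 means 498 fully called samples, so AN = 2 * 498 = 996 and AC = 234 + 2 * 168 = 570. A small sketch of that arithmetic follows; `ac_an_from_gtc` is a hypothetical helper name, and it assumes diploid samples and that AC and AN are computed from the same fully called genotypes as GTC, which the help text does not state explicitly.

```python
def ac_an_from_gtc(gtc, n_alt):
    """Derive (AC list, AN) from genotype counts given in GTC order.

    Assumption: every counted genotype is diploid and fully called, so each
    one contributes exactly two alleles to AN.
    """
    ac = [0] * n_alt
    idx = 0
    # Walk the genotypes in GTC order: 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ...
    for k in range(n_alt + 1):
        for j in range(k + 1):
            count = gtc[idx]
            idx += 1
            for allele in (j, k):
                if allele > 0:
                    ac[allele - 1] += count
    an = 2 * sum(gtc)
    return ac, an

# First and last records of the example file
print(ac_an_from_gtc([96, 234, 168], 1))
print(ac_an_from_gtc([59, 219, 220], 1))
```

If a real input contains partially called genotypes such as ./1, AC and AN might legitimately differ from what GTC implies, depending on how the producing tool counts them.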

Such a file is presumably generated from multiple single-sample VCF files or from one large merged VCF file with sample columns. One could write a script to do this, but if anyone knows of an existing tool, please let me know.
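
For reference, the kind of script I have in mind would look roughly like the sketch below. It uses only the Python standard library and converts one tab-separated data line of a merged multi-sample VCF into a sample-free line with AC, AN and GTC. The function name `population_stats_line` is made up for illustration, and the sketch assumes diploid calls, a GT key in the FORMAT column, and that AC and AN are counted only over fully called genotypes (the help text is not explicit about partially called ones).

```python
import re

def population_stats_line(line):
    """Turn one multi-sample VCF data line into a line without FORMAT/sample columns."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt = fields[:7]
    gt_pos = fields[8].split(":").index("GT")  # locate GT within FORMAT
    n_alt = len(alt.split(","))

    gtc = [0] * ((n_alt + 1) * (n_alt + 2) // 2)
    ac = [0] * n_alt
    an = 0
    for sample in fields[9:]:
        alleles = re.split(r"[/|]", sample.split(":")[gt_pos])
        if len(alleles) != 2 or "." in alleles:
            continue  # discard non-diploid or partially/uncalled genotypes
        j, k = sorted(int(a) for a in alleles)  # phasing is ignored
        gtc[k * (k + 1) // 2 + j] += 1
        an += 2
        for allele in (j, k):
            if allele > 0:
                ac[allele - 1] += 1

    info = "AC={};AN={};GTC={}".format(
        ",".join(map(str, ac)), an, ",".join(map(str, gtc))
    )
    # The original INFO column is replaced by the three statistics fields.
    return "\t".join([chrom, pos, vid, ref, alt, qual, flt, info])

example = "1\t100\t.\tC\tG\t50\tPASS\t.\tGT:DP\t0/0:10\t0/1:12\t1|0:8\t1/1:9\t./.:0\t./1:3"
print(population_stats_line(example))
```

One would apply this to every non-header line and write the `##fileformat`, `##contig` and `##INFO` meta records from the example above in front of the output, together with an eight-column `#CHROM` line. Merging several single-sample VCF files into one would still have to happen beforehand.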

Thanks
