Question: VCF File Format QUAL Column
User6891 (Europe) wrote, 6.8 years ago:

I'm generating a VCF file containing genotype data from 5 individuals. For each variant in each individual I have a Phred quality score. I want to put the genotypes of all 5 individuals into a single VCF file, so what should I fill in for the QUAL column of my .vcf file? Do I have to sum the 5 individual Phred scores?

Tags: vcf, quality

Are you sure that pooling genotypes from several individuals into one VCF file is valid according to the format? Also, keep in mind that Phred scores are log-scaled error probabilities, not plain additive scores.

Comment written 6.8 years ago by Andreas
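
For reference, a Phred-scaled score Q encodes an error probability P through Q = -10 log10(P). The sketch below, in Python with hypothetical scores, shows what adding Phred scores would mean on the probability scale: it is equivalent to multiplying the error probabilities, which is only meaningful if the per-individual errors are independent. It illustrates the arithmetic and is not a recommendation for how QUAL should be computed.

```python
import math


def phred_to_prob(q):
    """Convert a Phred-scaled score to the error probability it encodes."""
    return 10 ** (-q / 10)


def prob_to_phred(p):
    """Convert an error probability back to a Phred-scaled score."""
    return -10 * math.log10(p)


# Hypothetical per-individual Phred scores for one variant.
scores = [30, 20, 40]

# Error probabilities implied by each score.
probs = [phred_to_prob(q) for q in scores]

# Multiplying the probabilities and converting back gives the same
# value (up to floating-point rounding) as summing the Phred scores.
phred_of_product = prob_to_phred(math.prod(probs))
phred_sum = sum(scores)

print(probs)
print(phred_of_product, phred_sum)
```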

Normally VCF files can contain genotypes from different individuals for the same position.

Maybe I should give an example of what I want to do. I have a variant file with the following information on each line. Suppose I have a variant on chromosome 1, position 10154262, reference allele = A, alternative allele = G. The line then also holds the variant allele frequency and Phred score for individual 1, for individual 2, and for individual 3.

Now in a .vcf file you need to fill in a column named 'QUAL' for every variant. But since I have 3 individual Phred scores, what should I put in the QUAL column?

Reply written 6.8 years ago by User6891
Raony Guimarães (Dublin / Ireland) wrote, 6.8 years ago:

Use vcf-concat:

vcf-concat A.vcf.gz B.vcf.gz C.vcf.gz | gzip -c > out.vcf.gz

http://vcftools.sourceforge.net/perl_module.html#vcf-concat


Nice. However, this is for combining different VCF files from one individual (one per chromosome, etc.). What would happen if you had two variants at the same position in the input files?

Comment written 6.8 years ago by Andreas
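
To illustrate the distinction raised here: concatenation stacks records from files that share the same samples, whereas combining different individuals means joining their calls site by site. Below is a rough Python sketch of the latter. The merge_by_site helper and the calls are made up for illustration (the second site is invented), and it assumes that a site absent from one individual's calls should be reported as missing (./.), which may not be right if absence actually means homozygous reference.

```python
def merge_by_site(per_sample):
    """Join per-individual calls into one row per site.

    per_sample maps a sample name to a dict of
    {(chrom, pos, ref, alt): genotype}.
    """
    names = list(per_sample)
    # Collect every site seen in any individual. Chromosome names are
    # compared as strings here, which is enough for this toy example.
    sites = sorted({site for calls in per_sample.values() for site in calls})
    rows = []
    for site in sites:
        # Assumption: a site not called in an individual is missing data.
        genotypes = [per_sample[name].get(site, "./.") for name in names]
        rows.append((site, genotypes))
    return names, rows


# Hypothetical calls for two individuals.
calls = {
    "ind1": {("1", 10154262, "A", "G"): "0/1"},
    "ind2": {
        ("1", 10154262, "A", "G"): "1/1",
        ("1", 10154300, "C", "T"): "0/1",
    },
}

names, rows = merge_by_site(calls)
for site, genotypes in rows:
    print(site, dict(zip(names, genotypes)))
```

A shared position ends up as a single row with one genotype per individual, rather than as two separate records.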
Laura (Cambridge UK) wrote, 6.8 years ago:

If you have no per-site score, you could use . in the QUAL column to represent missing data and supply the per-individual score in each individual's sample column along with the genotype.
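
A minimal sketch of that layout in Python, assuming the per-individual Phred score is stored in the GQ (genotype quality) FORMAT field. The function name vcf_record is hypothetical and the genotypes and scores are made up, since the question only gives allele frequencies. The coordinates are the ones from the example in the question.

```python
def vcf_record(chrom, pos, ref, alt, samples, qual=None):
    """Format one tab-separated VCF data line.

    samples is a list of (genotype, phred_score) pairs, one per individual.
    A qual of None is written as "." (missing).
    """
    qual_field = "." if qual is None else f"{qual:g}"
    # Fixed columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT.
    fixed = [chrom, str(pos), ".", ref, alt, qual_field, ".", ".", "GT:GQ"]
    # One GT:GQ entry per individual (assumption: score goes into GQ).
    sample_fields = [f"{gt}:{gq}" for gt, gq in samples]
    return "\t".join(fixed + sample_fields)


# Hypothetical genotypes and Phred scores for three individuals.
line = vcf_record(
    "1", 10154262, "A", "G",
    [("0/1", 35), ("1/1", 50), ("0/0", 20)],
)
print(line)
```

In a real file the GT and GQ keys would also need to be declared in the header, and the sample names listed on the #CHROM line.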
