Hi all.
I am working with VCF files to understand the format better and to use it more widely.
While working with my VCF files, a few questions came up. Below is one example row from my VCF:
1 898921 . C G,<X> 0 . DP=211;I16=112,51,1,0,7057,346989,16,256,8061,399183,50,2500,3220,73296,25,625;QS=0.997431,0.00256946,0;SGB=-0.379885;RPB=1;MQB=1;MQSB=0.792466;BQB=1;MQ0F=0 PL:DP:DV:DPR 0,255,255,255,255,255:164:1:163,1,0
As you can see, the row has four FORMAT fields (PL, DP, DV and DPR), which are defined in the header as follows:
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Number of high-quality bases">
##FORMAT=<ID=DV,Number=1,Type=Integer,Description="Number of high-quality non-reference bases">
##FORMAT=<ID=DPR,Number=R,Type=Integer,Description="Number of high-quality bases observed for each allele">
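To make the pairing between the FORMAT keys and the sample values explicit, here is a minimal Python sketch that splits the example row. It assumes a single sample column, splits on whitespace because the row is pasted here with spaces (a real VCF is tab-separated), and the variable names are only illustrative.

```python
# Example row from the post; a real VCF uses tabs between columns.
row = (
    "1 898921 . C G,<X> 0 . "
    "DP=211;I16=112,51,1,0,7057,346989,16,256,8061,399183,50,2500,3220,73296,25,625;"
    "QS=0.997431,0.00256946,0;SGB=-0.379885;RPB=1;MQB=1;MQSB=0.792466;BQB=1;MQ0F=0 "
    "PL:DP:DV:DPR 0,255,255,255,255,255:164:1:163,1,0"
)

# Eight fixed columns, then FORMAT, then one sample column (assumed here).
chrom, pos, vid, ref, alt, qual, flt, info, fmt, sample = row.split()

# Allele list: REF first, then every ALT in order.
alleles = [ref] + alt.split(",")

# Pair each FORMAT key with its colon-separated sample value.
# Note: int() would fail on a missing value ("."); this row has none.
parsed = {
    key: [int(v) for v in value.split(",")]
    for key, value in zip(fmt.split(":"), sample.split(":"))
}

print(alleles)
print(parsed)
```

So the six PL values, the single DP and DV values, and the three DPR values each belong to their own key, and the allele list has three entries because `<X>` is listed in ALT alongside G.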
First, the PL field. The PL values in my example are 0,255,255,255,255,255. There are six comma-separated values, and I think the first three are for the reference allele and the others are for the alternate allele. Concretely, a value of 255 corresponds to a likelihood of 10^(-25.5), which is very close to zero, and the remaining values are the same. How should I interpret these values? I found that almost all of the values are 255.
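As far as I can tell from the VCF specification, a field declared with Number=G holds one value per possible genotype rather than per allele, and for a diploid sample the genotype j/k (with j <= k) sits at index k*(k+1)/2 + j. With three alleles (C, G and `<X>`) that gives six genotypes. The sketch below lists them next to the PL values; it assumes a diploid sample, and the helper name `diploid_genotypes` is hypothetical.

```python
ref = "C"
alts = ["G", "<X>"]
pl = [0, 255, 255, 255, 255, 255]  # PL values from the example row

alleles = [ref] + alts


def diploid_genotypes(n_alleles):
    """Yield (j, k) allele-index pairs in Number=G order: index = k*(k+1)/2 + j."""
    for k in range(n_alleles):
        for j in range(k + 1):
            yield j, k


rows = []
for (j, k), phred in zip(diploid_genotypes(len(alleles)), pl):
    # PL is Phred-scaled: relative likelihood = 10^(-PL/10),
    # normalised so that the most likely genotype has PL = 0.
    likelihood = 10 ** (-phred / 10)
    rows.append((f"{alleles[j]}/{alleles[k]}", phred, likelihood))

for genotype, phred, likelihood in rows:
    print(f"{genotype:10s} PL={phred:3d} relative likelihood={likelihood:.3g}")
```

Read this way, PL=0 belongs to C/C (homozygous reference) and every other genotype gets 255, meaning each is far less likely than C/C. The repeated 255 looks like a ceiling applied by the caller, although that is my assumption rather than something stated in the header.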
Secondly, the DPR field has three comma-separated values; in my example it is 163,1,0. From this I understand that the first two are the numbers of reads supporting the ref and alt alleles, respectively. But what does the third value, which is zero, mean? I don't get it. Please help!
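For the DPR question, the header declares Number=R, which in the VCF specification means one value per allele including the reference. Here is a small sketch that lines the three values up with the three alleles of the row, assuming (as I understand the samtools/bcftools convention) that `<X>` is a symbolic placeholder for any other, unobserved alternate allele.

```python
ref = "C"
alts = ["G", "<X>"]
dpr = [163, 1, 0]  # DPR values from the example row
dp, dv = 164, 1    # DP and DV from the same sample column

alleles = [ref] + alts

# Number=R: one count per allele, REF first, then each ALT in order.
depth_by_allele = dict(zip(alleles, dpr))
print(depth_by_allele)

# Consistency checks against the other FORMAT fields of this row.
total_depth = sum(dpr)         # should match DP
non_ref_depth = sum(dpr[1:])   # should match DV
print(total_depth == dp, non_ref_depth == dv)
```

Under that reading the third value is simply the count for `<X>`: no high-quality bases supported any allele other than C or G. The counts also add up with the rest of the row, since 163 + 1 + 0 equals DP (164) and the non-reference part equals DV (1).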
Just curious, but what software generated this VCF file?