Question: VCF sample fields output
14 months ago, v82masae wrote:

Hi,

I am analysing some SNPs in a VCF and have found some mutations of interest, but I would like to know what the output in the sample fields means.

The two SNPs I am interested in are these:

1   270574995   .   T   C   8523.06 PASS    
AC=64;AF=0.889;AN=72;DP=333;FS=0.000;MQ=60.00;set=Intersection  GT:AD:DP:GQ:PGT:PID:PL  
1/1:0,20:20:60:1|1:270574995_T_C:900,60,0   1/1:0,13:13:39:1|1:270574995_T_C:585,39,0   
0/1:1,5:6:27:0|1:270574995_T_C:207,0,27 1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,12:12:36:1|1:270574995_T_C:540,36,0   1/1:0,16:16:48:1|1:270574995_T_C:720,48,0   
0/1:1,7:8:21:0|1:270574995_T_C:291,0,21 1/1:0,19:19:57:1|1:270574995_T_C:855,57,0   
1/1:0,29:29:87:1|1:270574995_T_C:1305,87,0  1/1:0,21:21:63:1|1:270574995_T_C:945,63,0   
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 1/1:0,6:6:18:1|1:270574995_T_C:249,18,0 
0/1:5,6:11:99:0|1:270574995_T_C:237,0,192   1/1:0,10:10:36:1|1:270574995_T_C:509,36,0   
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/1:2,4:6:99:0|1:270574995_T_C:159,0,99 
1/1:0,7:7:21:1|1:270574995_T_C:305,21,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:5,0:5:0:.:.:0,0,92  
1/1:0,3:3:9:1|1:270574995_T_C:135,9,0   1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 1/1:0,11:11:33:1|1:270574995_T_C:495,33,0   
1/1:0,8:8:24:1|1:270574995_T_C:355,24,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 
1/1:0,9:9:27:1|1:270574995_T_C:372,27,0 1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 
1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:4,0:4:12:.:.:0,12,109   
1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 1/1:0,11:11:33:1|1:270574995_T_C:495,33,0

1   270574996   .   T   A   8523.06 PASS    AC=64;AF=0.889;AN=72;DP=335;MQ=60.00;set=Intersection   
GT:AD:DP:GQ:PGT:PID:PL  1/1:0,20:20:60:1|1:270574995_T_C:900,60,0   
1/1:0,13:13:39:1|1:270574995_T_C:585,39,0   0/1:1,5:6:27:0|1:270574995_T_C:207,0,27 
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,12:12:36:1|1:270574995_T_C:540,36,0   
1/1:0,16:16:48:1|1:270574995_T_C:720,48,0   0/1:1,7:8:21:0|1:270574995_T_C:291,0,21 
1/1:0,19:19:57:1|1:270574995_T_C:855,57,0   1/1:0,29:29:87:1|1:270574995_T_C:1305,87,0  
1/1:0,21:21:63:1|1:270574995_T_C:945,63,0   1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
1/1:0,5:5:18:1|1:270574995_T_C:249,18,0 0/1:5,6:11:99:0|1:270574995_T_C:237,0,192   
1/1:0,12:12:36:1|1:270574995_T_C:509,36,0   1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
0/1:3,4:7:99:0|1:270574995_T_C:159,0,99 1/1:0,7:7:21:1|1:270574995_T_C:305,21,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:5,0:5:0:.:.:0,0,92  1/1:0,3:3:9:1|1:270574995_T_C:135,9,0   
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 
1/1:0,11:11:33:1|1:270574995_T_C:495,33,0   1/1:0,8:8:24:1|1:270574995_T_C:355,24,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 
1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 1/1:0,9:9:27:1|1:270574995_T_C:372,27,0 
1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:4,0:4:12:.:.:0,12,109   1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 
1/1:0,11:11:33:1|1:270574995_T_C:495,33,0

These are consecutive SNPs: one is very deleterious and the other compensates for it. I would like to know why the sample fields, where I get the genotype for both alleles in each sample, look that way. I would expect these fields to look like, say, 1/1:0,20:20:60:1, but I get 1/1:0,20:20:60:1|1:270574995_T_C:900,60,0. Why is that? I have checked other SNPs and they look as expected.

I would also like to know why the second mutation has the first mutation cited in its sample fields.

Does anyone know whether this is a special type of output that means something, or can I simply ignore it?

Thanks


You should look at your VCF header. From the FORMAT column, it is evident the field you're looking for information on is called PID, so look for that in the header's ##FORMAT section.

written 14 months ago by RamRS
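A minimal Python sketch of that header lookup. The header lines are illustrative stand-ins rather than copies from the poster's file (the exact Description wording depends on the variant caller and its version), and `format_descriptions` is a hypothetical helper name.

```python
import re

# Illustrative header lines; the real wording in your file may differ.
header_lines = [
    '##fileformat=VCFv4.2',
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    '##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information">',
    '##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information">',
]

# Capture the ID and the quoted Description of each ##FORMAT line.
FORMAT_RE = re.compile(r'^##FORMAT=<ID=([^,]+),.*Description="(.*)">$')

def format_descriptions(lines):
    """Map each FORMAT key (e.g. 'PID') to its header description."""
    out = {}
    for line in lines:
        match = FORMAT_RE.match(line.strip())
        if match:
            out[match.group(1)] = match.group(2)
    return out

descriptions = format_descriptions(header_lines)
for key in ("PGT", "PID"):
    print(key, "->", descriptions[key])
```

For a real file, the same function could be fed the lines that start with `##` until the `#CHROM` line is reached.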

I looked at the header and it does indeed describe the PID and PGT fields. I have been reading about their meaning, which relates to physical phasing. From what I understand, this applies to consecutive or nearby variants. I do not understand the implications of that, though; I also noticed that the allele frequency (AF) is the same for both SNPs.

Could this mean that both SNPs are always present as a haplotype and always segregate together?

written 14 months ago by v82masae
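A quick check of the numbers behind that observation, as a sketch only. AC and AN come from the INFO column quoted in the question, and the genotype lists hold the GT of the first five samples of each record. Equal AF values alone do not show that two variants travel together; comparing the calls sample by sample is more informative.

```python
# INFO values shared by both records in the question.
ac, an = 64, 72
af = round(ac / an, 3)  # AF is the ALT allele count over all called alleles
print(af)

# GT of the first five samples, copied from each of the two records.
gt_first = ["1/1", "1/1", "0/1", "1/1", "1/1"]
gt_second = ["1/1", "1/1", "0/1", "1/1", "1/1"]

# Count samples whose genotype is identical at both positions.
concordant = sum(a == b for a, b in zip(gt_first, gt_second))
print(f"{concordant}/{len(gt_first)} samples carry the same genotype at both sites")
```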

"this applies for consecutive variants or near variants"

It means that the variants are located on the same homologous chromosome, that is, on the same haplotype.

written 11 months ago by Pierre Lindenbaum
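To make the sample strings from the question easier to read, here is a hedged Python sketch that splits one sample column by the FORMAT keys and then tests whether two calls sit in the same phase set with the ALT allele on a shared haplotype. `parse_sample` and `same_haplotype` are hypothetical helper names, and the rule used (same PID, ALT on the same side of PGT) is a simplified reading of the phasing fields, not an official API.

```python
FORMAT = "GT:AD:DP:GQ:PGT:PID:PL"

def parse_sample(format_str, sample_str):
    """Pair each FORMAT key with the matching colon-separated value."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))

def same_haplotype(a, b):
    """True if both calls share a PID and carry ALT on a common haplotype."""
    if "." in (a.get("PID", "."), b.get("PID", ".")):
        return False  # at least one call is unphased
    if a["PID"] != b["PID"]:
        return False  # different phase sets cannot be compared
    # PGT such as "0|1" lists the allele on each of the two haplotypes.
    return any(x == y == "1"
               for x, y in zip(a["PGT"].split("|"), b["PGT"].split("|")))

# Sample strings copied from the two records in the question.
hom = parse_sample(FORMAT, "1/1:0,20:20:60:1|1:270574995_T_C:900,60,0")
het = parse_sample(FORMAT, "0/1:1,5:6:27:0|1:270574995_T_C:207,0,27")
ref = parse_sample(FORMAT, "0/0:5,0:5:0:.:.:0,0,92")

print(hom)                       # seven named subfields instead of one opaque string
print(same_haplotype(het, het))  # same sample string appears at both positions
print(same_haplotype(hom, ref))  # the 0/0 call has no phasing information
```

Read this way, `1|1` is the PGT value and `270574995_T_C` is the PID value, which is why the second SNP names the first: the phase set is labelled by its first variant.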