Question: VCF sample fields output
9 months ago, v82masae wrote:

Hi,

I am analysing some SNPs in a VCF and have found some mutations of interest, but I would like to know what the output in the sample fields means.

The two SNPs I am interested in are these:

1   270574995   .   T   C   8523.06 PASS    
AC=64;AF=0.889;AN=72;DP=333;FS=0.000;MQ=60.00;set=Intersection  GT:AD:DP:GQ:PGT:PID:PL  
1/1:0,20:20:60:1|1:270574995_T_C:900,60,0   1/1:0,13:13:39:1|1:270574995_T_C:585,39,0   
0/1:1,5:6:27:0|1:270574995_T_C:207,0,27 1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,12:12:36:1|1:270574995_T_C:540,36,0   1/1:0,16:16:48:1|1:270574995_T_C:720,48,0   
0/1:1,7:8:21:0|1:270574995_T_C:291,0,21 1/1:0,19:19:57:1|1:270574995_T_C:855,57,0   
1/1:0,29:29:87:1|1:270574995_T_C:1305,87,0  1/1:0,21:21:63:1|1:270574995_T_C:945,63,0   
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 1/1:0,6:6:18:1|1:270574995_T_C:249,18,0 
0/1:5,6:11:99:0|1:270574995_T_C:237,0,192   1/1:0,10:10:36:1|1:270574995_T_C:509,36,0   
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/1:2,4:6:99:0|1:270574995_T_C:159,0,99 
1/1:0,7:7:21:1|1:270574995_T_C:305,21,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:5,0:5:0:.:.:0,0,92  
1/1:0,3:3:9:1|1:270574995_T_C:135,9,0   1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 1/1:0,11:11:33:1|1:270574995_T_C:495,33,0   
1/1:0,8:8:24:1|1:270574995_T_C:355,24,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 
1/1:0,9:9:27:1|1:270574995_T_C:372,27,0 1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 
1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:4,0:4:12:.:.:0,12,109   
1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 1/1:0,11:11:33:1|1:270574995_T_C:495,33,0

1   270574996   .   T   A   8523.06 PASS    AC=64;AF=0.889;AN=72;DP=335;MQ=60.00;set=Intersection   
GT:AD:DP:GQ:PGT:PID:PL  1/1:0,20:20:60:1|1:270574995_T_C:900,60,0   
1/1:0,13:13:39:1|1:270574995_T_C:585,39,0   0/1:1,5:6:27:0|1:270574995_T_C:207,0,27 
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,12:12:36:1|1:270574995_T_C:540,36,0   
1/1:0,16:16:48:1|1:270574995_T_C:720,48,0   0/1:1,7:8:21:0|1:270574995_T_C:291,0,21 
1/1:0,19:19:57:1|1:270574995_T_C:855,57,0   1/1:0,29:29:87:1|1:270574995_T_C:1305,87,0  
1/1:0,21:21:63:1|1:270574995_T_C:945,63,0   1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
1/1:0,5:5:18:1|1:270574995_T_C:249,18,0 0/1:5,6:11:99:0|1:270574995_T_C:237,0,192   
1/1:0,12:12:36:1|1:270574995_T_C:509,36,0   1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
0/1:3,4:7:99:0|1:270574995_T_C:159,0,99 1/1:0,7:7:21:1|1:270574995_T_C:305,21,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:5,0:5:0:.:.:0,0,92  1/1:0,3:3:9:1|1:270574995_T_C:135,9,0   
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 
1/1:0,11:11:33:1|1:270574995_T_C:495,33,0   1/1:0,8:8:24:1|1:270574995_T_C:355,24,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 
1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 1/1:0,9:9:27:1|1:270574995_T_C:372,27,0 
1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:4,0:4:12:.:.:0,12,109   1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 
1/1:0,11:11:33:1|1:270574995_T_C:495,33,0

These are consecutive SNPs: one is very deleterious and the other compensates for it. I would like to know why the sample fields, where I get the genotype for both alleles in each sample, look that way. I would expect these fields to look like, say, 1/1:0,20:20:60:1, but I get 1/1:0,20:20:60:1|1:270574995_T_C:900,60,0. Why is that? I've checked other SNPs and they look as expected.

I would also like to know why the second mutation has the first mutation cited in its sample fields.

Does anyone know whether this is a special type of output that means something, or should I simply not care about it?

Thanks

Tags: snp, sample, vcf

You should look at your VCF header. From the FORMAT column, it is evident the field you're looking for information on is called PID, so look for that in the header's ##FORMAT section.
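
To make that lookup concrete, here is a minimal Python sketch that pairs each key of the FORMAT column with the value at the same position in one sample field. The function name parse_sample is hypothetical and not part of any VCF library; the FORMAT string and the sample value are copied from the first record in the question.

```python
def parse_sample(format_column, sample_column):
    """Pair each FORMAT key with the value at the same position in the sample field."""
    keys = format_column.split(":")
    values = sample_column.split(":")
    return dict(zip(keys, values))


# Values copied from the first record and its first sample in the question.
fmt = "GT:AD:DP:GQ:PGT:PID:PL"
sample = "1/1:0,20:20:60:1|1:270574995_T_C:900,60,0"

fields = parse_sample(fmt, sample)
for key, value in fields.items():
    print(f"{key}\t{value}")
```

Read this way, the "extra" part of the field is just the PGT value (1|1), the PID value (270574995_T_C) and the PL value (900,60,0); the definitions of PGT and PID should appear in the ##FORMAT lines of the header.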

written 9 months ago by RamRS

I looked at the header and indeed it is referring to the PID and PGT fields. I have been reading about their meaning, which is related to physical phasing. From what I have understood, this applies for consecutive variants or near variants. I do not understand what the implications are, as I also noticed that the allele frequency (AF) is the same for both SNPs.

Could this mean that both SNPs are always present as a haplotype and always segregate together?
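
As a rough check on the AF observation, the following Python sketch recomputes AC, AN and AF from genotype tallies. The tallies (30 samples with 1/1, 4 with 0/1, 2 with 0/0) were counted by hand from the GT subfields of the records quoted in the question, and the function name allele_stats is hypothetical.

```python
from collections import Counter

# Genotype tallies for the 36 samples, counted from the GT subfields
# of the records quoted in the question.
genotype_counts = Counter({"1/1": 30, "0/1": 4, "0/0": 2})


def allele_stats(counts):
    """Return (AC, AN, AF) for a biallelic site from diploid genotype counts."""
    ac = sum(gt.split("/").count("1") * n for gt, n in counts.items())
    an = sum(len(gt.split("/")) * n for gt, n in counts.items())
    return ac, an, ac / an


ac, an, af = allele_stats(genotype_counts)
print(ac, an, round(af, 3))  # 64 72 0.889
```

An identical AF only says that the ALT allele counts match at the two sites. Whether the ALT alleles travel together is a separate question, answered per sample by the PGT and PID values.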

written 9 months ago by v82masae

"this applies for consecutive variants or near variants"

It means that the variants are located on the same homologous chromosome.
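
A minimal Python sketch of how this can be read off the sample fields, assuming PGT and PID have the physical-phasing meaning written by callers such as GATK HaplotypeCaller: records that share a PID within a sample belong to one phasing group, and PGT says which haplotype of that group carries the ALT allele. The helper names alt_haplotypes and alts_in_cis are hypothetical; the values are taken from a heterozygous sample in the question.

```python
def alt_haplotypes(pgt):
    """Return the haplotype indexes (0 or 1) that carry ALT in a PGT string such as '0|1'."""
    return {i for i, allele in enumerate(pgt.split("|")) if allele == "1"}


def alts_in_cis(site_a, site_b):
    """True if both sites share a phasing group and some haplotype carries ALT at both."""
    if "." in (site_a["PID"], site_b["PID"]) or site_a["PID"] != site_b["PID"]:
        return False  # no shared phasing group, so nothing can be concluded
    return bool(alt_haplotypes(site_a["PGT"]) & alt_haplotypes(site_b["PGT"]))


# Third sample in the question (GT 0/1) at the two consecutive positions.
site1 = {"PGT": "0|1", "PID": "270574995_T_C"}  # position 270574995
site2 = {"PGT": "0|1", "PID": "270574995_T_C"}  # position 270574996

print(alts_in_cis(site1, site2))  # True
```

This would also explain why the second record cites the first one: the PID names the phasing group after its first variant (position 270574995, T to C), so every record in that group repeats the same identifier.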

written 6 months ago by Pierre Lindenbaum