Hello,
I performed variant calling and genotyping of my samples using GATK 4.2.6.1. In the resulting VCF, each genotype contains only 2 alleles (separated by "/" or "|"), yet the number of allelic depths (AD) is larger than the number of alleles called: the first example below has 3 AD values for 2 distinct alleles, and the second has 4 AD values for 1 distinct allele. What is the reason for this?
1|2:11,70,65:146:99:0|1:2857836_C_*:5031,2213,2478,2377,0,2643:2857836
3|3:0,0,0,4:4:13:1|1:1933943_TTAAGGTAG_T:139,139,139,139,139,139,13,13,13,0:1933943
I thought that in this case only the depths for alleles 1 and 2 (or for allele 3 in the second example) should be given. Do the remaining AD values correspond to other alleles, for example the reference allele or alleles found in other samples?
cheers
Show us the whole line of this VCF, or at least the CHROM POS ID REF ALT QUAL FILTER INFO FORMAT columns.
And for the second genotype above
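I agree with @leipzig. At this position there are 4 alleles in total: 1 reference (G) and 3 alternates (T, C, and *, where * marks an allele that is missing because of an overlapping upstream deletion, which GATK calls a spanning deletion). AD is reported for every allele at the site, not just the alleles in a sample's genotype: it gives the number of unfiltered reads that support each allele, in the order REF first, then each ALT as listed.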
Here, in sample Bc_ref_BXQ_D-illumina_reads at position contig_9:1933948, you have 0 reads supporting G, 2 supporting T, 4 supporting C, and 0 supporting the spanning deletion (*).
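To make the ordering concrete, here is a minimal Python sketch that pairs each AD value with its allele. The function names `ad_per_allele` and `sample_field` are made up for illustration. The FORMAT string `GT:AD:DP:GQ:PGT:PID:PL:PS` is an assumption, since the FORMAT column was not posted; it is only inferred from the shape of the sample fields in the question.

```python
def ad_per_allele(ref, alt, ad):
    """Pair each AD value with its allele: REF first, then ALTs in listed order."""
    alleles = [ref] + alt.split(",")
    # "." is the VCF missing-value marker
    depths = [None if d == "." else int(d) for d in ad.split(",")]
    if len(depths) != len(alleles):
        raise ValueError(
            f"expected {len(alleles)} AD values (1 REF + {len(alleles) - 1} ALT), "
            f"got {len(depths)}"
        )
    return dict(zip(alleles, depths))


def sample_field(fmt, sample, key):
    """Look up one FORMAT key in a colon-separated per-sample column."""
    return dict(zip(fmt.split(":"), sample.split(":")))[key]


# REF/ALT and read counts as described in the answer above.
print(ad_per_allele("G", "T,C,*", "0,2,4,0"))
# {'G': 0, 'T': 2, 'C': 4, '*': 0}

# Second sample field from the question, with an ASSUMED FORMAT column.
FMT = "GT:AD:DP:GQ:PGT:PID:PL:PS"
SAMPLE = ("3|3:0,0,0,4:4:13:1|1:1933943_TTAAGGTAG_T:"
          "139,139,139,139,139,139,13,13,13,0:1933943")
print(sample_field(FMT, SAMPLE, "GT"))  # 3|3
print(sample_field(FMT, SAMPLE, "AD"))  # 0,0,0,4
```

The genotype `3|3` names a single distinct allele (index 3), yet AD still carries four values, one per allele at the site (index 0 is REF, indices 1 to 3 are the ALTs). The same logic explains the first example: 3 AD values means the site has 1 REF and 2 ALT alleles, even though the genotype `1|2` does not include the reference.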