GATK joint genotyping
5 months ago

Hi,

I used GATK HaplotypeCaller to generate gVCFs for 9 samples (BP_RESOLUTION mode), and then used GenotypeGVCFs to do the joint calling.

It is very important for me to know whether sites are called or not, so I checked the joint-genotyping VCF with all sites kept (no filters applied). When I extracted the records for a single individual, I found many 'no-call' sites.

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  M_FR_BOU_ln1
12      20283174        .       C       .       .       .       .       GT:AD:DP:RGQ    ./.:8:12:0
12      20283175        .       T       .       .       .       .       GT:AD:DP:RGQ    ./.:8:11:0
12      20283176        .       G       .       .       .       .       GT:AD:DP:RGQ    ./.:11:12:0
12      20283177        .       C       .       .       .       .       GT:AD:DP:RGQ    ./.:11:12:0
12      20283178        .       G       .       27.48   LowQual .       GT:DP:RGQ       0/0:12:36
12      20283179        .       C       .       .       .       .       GT:AD:DP:RGQ    ./.:12:13:0

However, when I went back to the single-sample gVCF for this individual, these sites had been called with a genotype.

12      20283174        .       C       <NON_REF>       .       .       .       GT:AD:DP:GQ:PL  0/0:8,4:12:0:0,0,164
12      20283175        .       T       <NON_REF>       .       .       .       GT:AD:DP:GQ:PL  0/0:8,3:11:0:0,0,204
12      20283176        .       G       <NON_REF>       .       .       .       GT:AD:DP:GQ:PL  0/0:11,1:12:0:0,0,399
12      20283177        .       C       <NON_REF>       .       .       .       GT:AD:DP:GQ:PL  0/0:11,1:12:0:0,0,419
12      20283178        .       G       A,<NON_REF>     0       .       .       GT:AD:DP:GQ:PGT:PID:PL:PS:SB    0|0:12,0,0:12:36:0|1:20283171_A_*:0,36,535,36,535,535:20283171:4,8,0,0
12      20283179        .       C       <NON_REF>       .       .       .       GT:AD:DP:GQ:PL  0/0:12,1:13:0:0,0,454
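
One thing the gVCF records above have in common is that every site that later becomes a no-call has GQ 0 and a PL that starts with 0,0. As a rough sketch (assuming the usual convention that GQ is the gap between the lowest and second-lowest PL, capped at 99; the helper name `gq_from_pl` is made up for illustration and is not a GATK function), the GQ values can be reproduced from the PL strings:

```python
def gq_from_pl(pl_field, cap=99):
    """Approximate GQ as (second-lowest PL) - (lowest PL), capped at `cap`.

    Assumption: this mirrors the common GQ convention; it is not GATK code.
    """
    pls = sorted(int(x) for x in pl_field.split(","))
    return min(pls[1] - pls[0], cap)


# PL strings copied from the gVCF records in this post
examples = [
    (20283174, "0,0,164"),
    (20283178, "0,36,535,36,535,535"),
]

for pos, pl in examples:
    print(pos, gq_from_pl(pl))
```

A PL of 0,0,164 means the two most likely genotypes are tied, so the 0/0 written in the gVCF carries no confidence, whereas position 20283178 has a gap of 36.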

I am not very clear on how joint genotyping works. Is there an explanation of how a single individual's genotypes are affected in this process? Are these sites called or not, and what does './.' mean in both VCFs? Any suggestions and comments would be very helpful!

Best, Monica

GenotypeGVCFs VCF GATK
5 months ago

'./.' means no call. I'm not sure why those positions are not called as reference, but clearly something has GATK spooked; I guess it just wants more depth. With joint genotyping, the samples in the cohort influence each other's calls, so regions with a lot of quality problems don't get called as aggressively. This is all kind of nuts from a reproducibility standpoint.
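
To check which sites are no-calls for one sample, something like the following may help. It is only a sketch: the helper names `parse_sample` and `is_no_call` are invented here, it splits on whitespace because the records were pasted with spaces (a real VCF is tab-delimited), and it uses two of the joint-VCF records from the question as input.

```python
import re


def parse_sample(line, sample_index=0):
    """Return (chrom, pos, {FORMAT key: value}) for one sample on a VCF data line."""
    cols = line.split()
    keys = cols[8].split(":")                       # FORMAT column
    values = cols[9 + sample_index].split(":")      # chosen sample column
    return cols[0], int(cols[1]), dict(zip(keys, values))


def is_no_call(gt):
    """True when every allele in the GT string is '.', e.g. './.' or '.|.'."""
    return all(allele == "." for allele in re.split(r"[/|]", gt))


# Two joint-VCF records copied from the question
records = [
    "12 20283177 . C . . . . GT:AD:DP:RGQ ./.:11:12:0",
    "12 20283178 . G . 27.48 LowQual . GT:DP:RGQ 0/0:12:36",
]

for line in records:
    chrom, pos, fields = parse_sample(line)
    status = "no-call" if is_no_call(fields["GT"]) else "called"
    print(chrom, pos, fields["GT"], status, "RGQ=" + fields.get("RGQ", "NA"))
```

In the pasted records the no-call sites all have RGQ 0 and the one called site has RGQ 36, which is at least consistent with the no-calls reflecting zero confidence in the reference genotype. That is an observation from these six lines, not a statement about GATK internals.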
