2/50 individuals show missing data at almost all positions in final VCF generated using HaplotypeCaller
4.5 years ago

Hello all,

I used HaplotypeCaller from GATK to generate a VCF from 50 individuals. My reference is Rattus norvegicus, and 48 of the 50 individuals are Rattus norvegicus. The remaining two individuals are Rattus rattus. The WGS reads for these two individuals were mapped to the same Rattus norvegicus reference as the other 48. In my final VCF, almost all positions in these two Rattus rattus individuals are reported as ./. I checked in IGV and found read coverage at all of those sites (average coverage is around 15x). I did not apply any filter while calling the variants. In fact, variants are called at the same positions when only these two individuals are used as input with the same reference.

I have posted this on GATK support, but I am wondering whether anybody in this community has experienced the same thing and can point out why I am seeing ./. for only these two samples.

Thank you.

Edit: I calculated the percentage of missing data and found 96% for one individual and 99% for the other, which indicates that the samples are being read (a small fraction of sites do get genotype calls).
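For anyone who wants to reproduce the per-sample missingness figure, here is a minimal sketch in plain Python. It is not the script used for the numbers above; the function names and the sample names in the inline example (`norvegicus_1`, `rattus_1`) are hypothetical. It assumes an uncompressed or gzipped multi-sample VCF in which GT is the first FORMAT key, and it counts a genotype as missing only when every allele is `.` (so `./.` or `.`, but not `./1`).

```python
import gzip
from collections import Counter


def missing_fraction_per_sample(lines):
    """Return {sample: fraction of sites whose genotype is fully missing}."""
    samples = []
    missing = Counter()
    total_sites = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            continue
        total_sites += 1
        for name, call in zip(samples, fields[9:]):
            gt = call.split(":", 1)[0]  # assumes GT is the first FORMAT key
            alleles = gt.replace("|", "/").split("/")
            if all(allele == "." for allele in alleles):
                missing[name] += 1
    return {s: (missing[s] / total_sites if total_sites else 0.0) for s in samples}


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Tiny made-up VCF with two hypothetical samples, for illustration only.
EXAMPLE = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tnorvegicus_1\trattus_1",
    "chr1\t100\t.\tA\tG\t50\t.\t.\tGT:DP\t0/1:14\t./.:0",
    "chr1\t200\t.\tC\tT\t50\t.\t.\tGT:DP\t0/0:16\t./.:0",
    "chr1\t300\t.\tG\tA\t50\t.\t.\tGT:DP\t1/1:15\t0/1:12",
]

result = missing_fraction_per_sample(EXAMPLE)
for sample, fraction in result.items():
    print(f"{sample}\t{fraction:.1%}")
```

On a real file this would be called as `missing_fraction_per_sample(open_vcf("cohort.vcf.gz"))`, where the file name is a placeholder. Comparing the output for the two Rattus rattus samples against the other 48 shows how large the gap in call rate is.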

GATK vcf missingdata HaplotypeCaller • 998 views

Did you get a response yet on the GATK Support Forum? Can you provide the link to your post there?

Can you paste the commands that you used leading up to the generation of the VCFs?
