Hello all,

I used HaplotypeCaller from GATK to generate a VCF from 50 individuals. My reference is Rattus norvegicus, and 48 of the 50 individuals are Rattus norvegicus. The remaining two individuals are Rattus rattus. The WGS reads for these two individuals were mapped to the same Rattus norvegicus reference as the other 48 individuals.

In my final VCF output, almost all positions in these two Rattus rattus individuals are reported as ./. I checked in IGV and found read coverage at all of those sites (average coverage is around 15x). I did not apply any filter while calling the variants. In fact, variants are called at the same positions when only these two individuals are used as input with the same reference.

I have posted this on GATK support, but I am wondering whether anybody in this community has experienced the same thing and can point out why I am seeing ./. for only these two samples.

Thank you.

Edit: I calculated the percentage of missing data and found 96% for one individual and 99% for the other, which indicates that the samples are at least being read.
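
For anyone who wants to reproduce this kind of missingness figure, here is a minimal Python sketch of one way to compute the per-sample fraction of missing genotypes from the lines of an uncompressed VCF. It is not necessarily how the numbers above were obtained. The function name `missing_fraction` and the sample names and records in the toy input are made up for illustration.

```python
from collections import Counter


def missing_fraction(vcf_lines):
    """Return {sample: fraction of records whose GT is missing (./. or .)}."""
    samples = []
    missing = Counter()
    total = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            continue
        total += 1
        fmt = fields[8].split(":")
        gt_idx = fmt.index("GT") if "GT" in fmt else None
        for name, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_idx] if gt_idx is not None else "."
            # Treat phased and unphased genotypes alike; a call is missing
            # only when every allele is "."
            alleles = gt.replace("|", "/").split("/")
            if all(a == "." for a in alleles):
                missing[name] += 1
    return {s: (missing[s] / total if total else 0.0) for s in samples}


# Toy input (hypothetical samples and records, for illustration only)
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tnorv1\trattus1",
    "chr1\t100\t.\tA\tG\t50\t.\t.\tGT:DP\t0/1:14\t./.:0",
    "chr1\t200\t.\tC\tT\t50\t.\t.\tGT:DP\t1/1:12\t./.:0",
    "chr1\t300\t.\tG\tA\t50\t.\t.\tGT\t0/0\t0/1",
    "chr1\t400\t.\tT\tC\t50\t.\t.\tGT\t./.\t.",
]

result = missing_fraction(demo)
for sample, frac in result.items():
    print(f"{sample}\t{frac:.2%}")
```

On a real file you could pass an open file handle instead of the list, for example `missing_fraction(open("cohort.vcf"))`, where `cohort.vcf` is a placeholder name.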

Have you received a response on the GATK support forum yet? Can you provide a link to your post there?

Can you paste the commands you used before generating the VCFs?


