Are there best practices in variant calling for avoiding mapping errors in segmental duplication regions?
I have a computer science background and was analyzing a VCF file (hg19) for academic purposes. I noticed the following 2 variants:
chrX 153006137 . G A 202.77 PASS AC=1;AF=0.5;AN=2;BaseQRankSum=-2.842;ClippingRankSum=0;DP=43;ExcessHet=3.0103;FS=3.938;MLEAC=1;MLEAF=0.5;MQ=55.16;MQRankSum=-6.426;QD=4.72;ReadPosRankSum=1.71;SOR=0.165 GT:AD:DP:GQ:PGT:PID:PL 0/1:35,8:43:99:0|1:153006137_G_A:231,0,2247
chrX 153006141 . T A 199.77 PASS AC=1;AF=0.5;AN=2;BaseQRankSum=-2.162;ClippingRankSum=0;DP=40;ExcessHet=3.0103;FS=3.807;MLEAC=1;MLEAF=0.5;MQ=54.78;MQRankSum=-6.184;QD=4.99;ReadPosRankSum=0.948;SOR=0.206 GT:AD:DP:GQ:PGT:PID:PL 0/1:32,8:40:99:0|1:153006137_G_A:228,0,2275
QUAL and GQ look OK to me. However, GT is 0/1 (heterozygous) for both, even though these results are from WES of a male human, who has only one copy of chrX.
The variants are not in the pseudoautosomal regions. However, they do lie in a segmental duplication region.
My interpretation (would you agree?) is that the reads supporting these variants have been wrongly mapped:
- chrX:153006137-G-A should be chr2:92028747
- chrX:153006141-T-A should be chr2:92028751-T-A
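
To make the reasoning above concrete, here is a minimal Python sketch of the check I have in mind: flag heterozygous calls on chrX outside the pseudoautosomal regions where the ALT-supporting reads have clearly lower mapping quality than the REF reads. It assumes a male sample, whitespace-separated single-sample records, and the hg19 PAR coordinates below. The function names and the MQRankSum cutoff of -3.0 are my own illustrative choices, not an established filter.

```python
# hg19 pseudoautosomal regions on chrX (1-based, inclusive coordinates).
HG19_CHRX_PARS = [(60001, 2699520), (154931044, 155260560)]


def in_par(pos):
    """Return True if a chrX position falls inside PAR1 or PAR2 (hg19)."""
    return any(start <= pos <= end for start, end in HG19_CHRX_PARS)


def parse_record(line):
    """Split one single-sample VCF record into chrom, pos, INFO and sample dicts."""
    fields = line.split()
    chrom, pos = fields[0], int(fields[1])
    info = {}
    for item in fields[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    return chrom, pos, info, sample


def suspicious_male_chrx_het(line, mq_rank_sum_cutoff=-3.0):
    """Flag a het call on non-PAR chrX whose ALT reads map worse than REF reads.

    mq_rank_sum_cutoff is a hypothetical threshold chosen for illustration only.
    """
    chrom, pos, info, sample = parse_record(line)
    if chrom != "chrX" or in_par(pos):
        return False
    alleles = set(sample["GT"].replace("|", "/").split("/"))
    is_het = len(alleles) > 1
    alt_reads_map_worse = float(info.get("MQRankSum", "0")) < mq_rank_sum_cutoff
    return is_het and alt_reads_map_worse


record = (
    "chrX 153006137 . G A 202.77 PASS "
    "AC=1;AF=0.5;AN=2;DP=43;MQ=55.16;MQRankSum=-6.426;QD=4.72 "
    "GT:AD:DP:GQ 0/1:35,8:43:99"
)
print(suspicious_male_chrx_het(record))
```

Both records above would be flagged by this check, since MQRankSum is around -6 for each. It only identifies suspicious calls; it does not show where the reads actually belong.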