ChrX allele frequency in males and females
6 days ago
kani.kanchan ▴ 10


I have a question about the allele frequency output files (.frq) from VCFtools, specifically for chromosome X variants. I am writing out allele frequencies for males and females separately to identify variants whose frequencies differ between the sexes. The dataset includes 84 males and 42 females, but two samples showed discordance between reported sex and genetic sex. This turned out to be a sample swap, which was confirmed by IBS analysis. I then used the bcftools reheader command to rename the affected samples in the VCF.

Given that males have one copy of chrX (outside the PAR), the N_CHR column in the male .frq file should max out at 84, and N_CHR in the female .frq file should vary but also max out at 84 (42*2). However, in my case the maximum N_CHR is 85 for males and 83 for females.

Can someone please help me understand what I am missing here?


5 days ago

Show us some chrX and chrY GT entries in the VCF for males.

They should look like 0 or 1, or 0/. or 1/. (instead of 0/0 or 0/1). Most variant callers don't treat sex chromosomes differently, but if you used --sample-ploidy in GATK you will see those hemizygous GT calls. That allows VCFtools to calculate the allele frequency properly.
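
To make the expected counts concrete, here is a rough Python sketch of how the number of called chromosomes at one site follows from the GT strings. It assumes N_CHR is simply the count of non-missing alleles across samples; the helper names `called_alleles` and `n_chr` and the genotype lists are made up for illustration and are not part of VCFtools.

```python
import re

def called_alleles(gt: str) -> int:
    """Count non-missing alleles in a GT string such as '0', '0/1', '0|1', '0/.' or './.'."""
    return sum(1 for allele in re.split(r"[/|]", gt) if allele not in (".", ""))

def n_chr(gts) -> int:
    """Total called alleles at one site (assumed to correspond to N_CHR)."""
    return sum(called_alleles(gt) for gt in gts)

# Hypothetical genotypes at a single non-PAR chrX site.
print(n_chr(["0"] * 84))               # 84 haploid males
print(n_chr(["0/1"] * 42))             # 42 diploid females
print(n_chr(["0"] * 83 + ["0/0"]))     # one diploid call in the male list
print(n_chr(["0/1"] * 41 + ["1"]))     # one haploid call in the female list
```

Under these assumptions the four totals are 84, 84, 85 and 83. So a maximum of 85 in the male file and 83 in the female file is what you would get if one sample in each list had the other sex's ploidy. That is only one possible explanation, but it is worth checking the GT fields of the two renamed samples first.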


