I have been using the --derived option in vcftools to obtain the frequencies of the ancestral and derived alleles in Africa, America, Europe, and Asia for the 1000 Genomes Project.
I expected to get the same number of SNPs with ancestry information for every continent. Instead, the number of SNPs for which ancestry information is available differs from one continent to another.
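To narrow this down, the per-continent outputs can be compared site by site to see exactly which SNPs are missing where. Below is a minimal sketch, assuming each output is a tab-separated vcftools-style .frq file with one header line and CHROM and POS in the first two columns. The function names and the file names in the usage comment are hypothetical, chosen only for illustration.

```python
import csv


def read_sites(path):
    """Return the set of (CHROM, POS) pairs listed in one frequency file.

    Assumes a tab-separated file with a single header line and
    CHROM and POS as the first two columns.
    """
    sites = set()
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header line
        for row in reader:
            if len(row) >= 2:
                sites.add((row[0], int(row[1])))
    return sites


def compare_sites(paths):
    """For each label, return the sites found in some other file but not in this one.

    `paths` maps a label (for example a continent code) to a file path.
    """
    per_file = {label: read_sites(path) for label, path in paths.items()}
    all_sites = set().union(*per_file.values())
    return {label: all_sites - sites for label, sites in per_file.items()}


# Example usage (hypothetical file names):
# missing = compare_sites({"AFR": "afr.frq", "EUR": "eur.frq"})
# for label, sites in missing.items():
#     print(label, len(sites), sorted(sites)[:10])
```

Looking at a handful of the sites reported as missing for one continent, and at their records in the original VCF, should show whether they share something in common.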
I cannot figure out how this can happen. Any insight would be appreciated.
Thank you very much in advance.