I am interested in the distribution of SNPs in different populations. I know that for a typical biallelic SNP, the frequencies of the major and minor alleles can differ between populations. Can they ever differ to the extent that one allele is the major allele in one population while the other allele is the major allele in a different population?
This then leads me on to the definition of a reference sequence for a SNP:
Is the reference sequence for a SNP simply the base that was present at this locus in the DNA that was sequenced for the reference genome (or the base with the highest occurrence, if the reference genome was heterozygous at that locus)? Or is the definition of the reference allele sequence linked in any way to the major allele for that SNP? If it is the latter, then I was wondering how different populations are taken into account. If it is the former, then the population issue is irrelevant.