Question: Order of Allele 1 and 2 in .bim File
LadyMorelin0 wrote:

Dear all,

Due to data protection constraints, I have to permute my patient data in the .ped file (while keeping the SNPs and the ordering of the SNPs unchanged) before sending the data to my collaboration partner. I didn't change the .map file.

I then used PLINK to generate the .bim/.bed/.fam files from both the original data set and the permuted data set to check whether the results differ. Unfortunately, the .bim files have some entries where allele 1 and allele 2 are ordered differently, meaning that allele 1 became allele 2 and vice versa. I am not a bioinformatician, so my question is: does this matter? Why is the order of the patients important for the definition of allele 1/2?
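A quick way to see exactly which entries are affected is to compare the two .bim files directly. The following is a minimal Python sketch, assuming the standard six-column .bim layout (chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2). The function names and the example rows are made up for illustration and are not taken from the actual data.

```python
def read_bim(text):
    """Parse .bim content into {variant_id: (allele1, allele2)}."""
    alleles = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        _chrom, snp_id, _cm, _bp, a1, a2 = fields
        alleles[snp_id] = (a1, a2)
    return alleles


def swapped_variants(bim_a, bim_b):
    """Return IDs of variants whose allele 1 and allele 2 are exchanged."""
    a = read_bim(bim_a)
    b = read_bim(bim_b)
    return [
        snp_id
        for snp_id, (a1, a2) in a.items()
        if a1 != a2 and b.get(snp_id) == (a2, a1)
    ]


# Hypothetical example rows (not real data).
ORIGINAL_BIM = """1 rs1 0 1000 A G
1 rs2 0 2000 C T
1 rs3 0 3000 G A
"""
PERMUTED_BIM = """1 rs1 0 1000 A G
1 rs2 0 2000 T C
1 rs3 0 3000 G A
"""

print(swapped_variants(ORIGINAL_BIM, PERMUTED_BIM))  # ['rs2']
```

For real files, the two strings would be replaced by the contents of the two .bim files, for example read with `open(path).read()`.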

Thanks in advance!

snp plink • 1.1k views
modified 2.1 years ago by Petr Ponomarenko2.6k • written 2.1 years ago by LadyMorelin0
Petr Ponomarenko2.6k wrote:

If I remember right, PLINK estimates allele frequencies using the first chunk of data and assigns the less frequent (minor) allele to be the first allele (A1). You can avoid this by providing reference alleles explicitly with the --reference-allele parameter, but then you will have trouble with sites that have three or more alleles. Anyway, for downstream analysis with PLINK this is not important: the 1/2 coding describes unphased data, so the order within a genotype does not matter.
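To illustrate why the swap loses no genotype information, here is a small Python sketch. It assumes each genotype is stored as the number of copies of allele 1 (0, 1 or 2), which is a simplification for illustration rather than the exact .bed encoding. The `decode` helper and the counts are hypothetical.

```python
def decode(a1_copies, a1, a2):
    """Turn a count of allele-1 copies into an unordered (unphased) genotype."""
    return tuple(sorted([a1] * a1_copies + [a2] * (2 - a1_copies)))


# Hypothetical genotypes for four individuals, labelled with A1 = 'A', A2 = 'G'.
counts_original = [0, 1, 2, 1]
genotypes_original = [decode(c, "A", "G") for c in counts_original]

# Same individuals after the labels are swapped (A1 = 'G', A2 = 'A'):
# every count of allele-1 copies becomes 2 minus the old count.
counts_swapped = [2 - c for c in counts_original]
genotypes_swapped = [decode(c, "G", "A") for c in counts_swapped]

print(genotypes_original == genotypes_swapped)  # True
```

One thing to keep in mind when comparing results between the two data sets: statistics that PLINK reports relative to allele 1 (for example an odds ratio or regression coefficient) will point in the opposite direction for the swapped variants, even though the underlying genotypes are identical.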

Please correct me if I remember wrong. Thank you

written 2.1 years ago by Petr Ponomarenko2.6k
