Question

Order of Allele 1 and 2 in .bim File

7.2 years ago

LadyMorelin • 0

Dear all,

Due to data protection constraints, I have to permute my patient data in the .ped file (while keeping the SNPs and their order unchanged) before sending the data to my collaboration partner. I did not change the .map file.

I then generated .bim/.bed/.fam files with PLINK from both the original and the permuted data set to check whether the results differ. Unfortunately, the .bim files have some entries where allele 1 and allele 2 are ordered differently, meaning that allele 1 became allele 2 and vice versa. I am not a bioinformatics person, so my question is: does this matter? And why is the order of the patients important for the definition of allele 1/2?

Thanks in advance!

SNP plink • 2.5k views

updated 7.2 years ago by Petr Ponomarenko ★ 2.8k • written 7.2 years ago by LadyMorelin • 0

score 0 · Answer 1 · 2017-03-03

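If I remember right, PLINK estimates allele frequencies from the data (possibly only from the first chunk it reads) and uses them to decide which allele is allele 1; by default the minor allele becomes allele 1. Reordering the samples can therefore flip the assignment, for example when both alleles are about equally frequent. You can avoid this by providing the reference alleles explicitly with the --reference-allele parameter, but then you will have trouble with sites that have three or more alleles. Anyway, for downstream analysis with PLINK this is not important, and the 1/2 coding represents unphased data, so the order within a genotype is not important either.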
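To check whether your two .bim files differ only by swapped alleles, something like the following Python sketch might help. It assumes the usual six-column .bim layout (chromosome, variant ID, cM position, bp position, allele 1, allele 2). The function names `read_bim` and `compare_bim` and the two example records are made up for illustration.

```python
import io


def read_bim(handle):
    """Return {variant_id: (allele1, allele2)} from an open .bim file."""
    alleles = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # .bim columns: chromosome, variant ID, cM position, bp position, A1, A2
        alleles[fields[1]] = (fields[4], fields[5])
    return alleles


def compare_bim(original, permuted):
    """Label each shared variant as 'same', 'swapped', or 'different'."""
    result = {}
    for variant_id, (a1, a2) in original.items():
        if variant_id not in permuted:
            continue  # variant missing from the second file
        b1, b2 = permuted[variant_id]
        if (a1, a2) == (b1, b2):
            result[variant_id] = "same"
        elif (a1, a2) == (b2, b1):
            result[variant_id] = "swapped"
        else:
            result[variant_id] = "different"
    return result


# Made-up example records standing in for the two real .bim files.
original_bim = io.StringIO("1\trs1\t0\t1000\tA\tG\n1\trs2\t0\t2000\tC\tT\n")
permuted_bim = io.StringIO("1\trs1\t0\t1000\tG\tA\n1\trs2\t0\t2000\tC\tT\n")

report = compare_bim(read_bim(original_bim), read_bim(permuted_bim))
print(report)  # {'rs1': 'swapped', 'rs2': 'same'}
```

With real data you would pass opened files instead of the `io.StringIO` objects, for example `open("original.bim")` (a hypothetical file name). If every differing variant comes out as "swapped" and none as "different", the two files describe the same alleles and only the 1/2 labelling has changed.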

Please correct me if I remember this wrong. Thank you.