Can someone explain PLINK allele REF/ALT management strategy?
2.3 years ago

Sometimes when merging two PLINK files, the reference (REF) and alternative (ALT) alleles may be reversed, e.g. REF G ALT A versus REF A ALT G.

The main reason for that is PLINK's default behavior. When PLINK writes a new binary fileset, it automatically assigns REF to the minor (least frequent) allele. So even if the files were aligned to the reference, the allele order is inadvertently changed after conversion to PLINK format.

The main issue, however, is that I do not understand why it changes only the REF/ALT notation in the .bim file, while the .bed file listing each individual and their genotypes stays unchanged. In other words, if SNP rs1234 for individual James Smith is 0/1, after PLINK's default action it is still 0/1. Yes, exactly the same, even though the "1" has changed from being G to A.

All the options PLINK provides to reverse REF/ALT do the same thing: they change the notation in the .bim file but do nothing to the .ped (.bed) files.

Can you explain why it behaves this way? By my calculations, it renders 30% of my .bim alleles incorrect after the merge. Am I missing something?

plink • 3.5k views
2.3 years ago
  1. A VCF "0/1" genotype is not ordered. It has the same meaning regardless of whether REF=A ALT=G or REF=G ALT=A.
  2. See for discussion of plink 1.x's allele-swapping behavior. (Bottom line: you probably want to use plink 2.0 whenever possible when allele order matters.)
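
To make point 1 concrete, here is a minimal Python sketch of why a heterozygous call survives a REF/ALT swap unchanged. It is a toy model of index-based biallelic genotypes, not PLINK's actual code; the helper names `decode` and `swap_ref_alt` and the G/A alleles for rs1234 are illustrative assumptions.

```python
from collections import Counter


def decode(gt, ref, alt):
    """Turn an index-based genotype such as (0, 1) into an unordered
    multiset of bases. 0 means REF, 1 means ALT (biallelic sites only)."""
    alleles = (ref, alt)
    return Counter(alleles[i] for i in gt)


def swap_ref_alt(gt):
    """Recode a biallelic genotype after REF and ALT trade places.
    Every index flips (0 <-> 1); sorting reflects that the call is unphased."""
    return tuple(sorted(1 - i for i in gt))


# Hypothetical site rs1234 with alleles G and A.
het = (0, 1)
hom_ref = (0, 0)

# Heterozygote: the recoded genotype is still (0, 1), and it still
# describes one G and one A, whichever allele is labelled REF.
assert swap_ref_alt(het) == het
assert decode(het, "G", "A") == decode(swap_ref_alt(het), "A", "G")

# Homozygote: the code has to flip to keep describing the same bases.
assert swap_ref_alt(hom_ref) == (1, 1)
assert decode(hom_ref, "G", "A") == decode((1, 1), "A", "G")

# Leaving a homozygous code untouched after a swap would change the bases.
assert decode(hom_ref, "G", "A") != decode(hom_ref, "A", "G")
```

So an unchanged 0/1 after a swap is expected. The calls to check are the homozygous ones: if those look identical before and after, the labels were swapped without recoding the genotypes.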
