plink2 reorder alleles by fasta reference
2.9 years ago
Filago ▴ 90

Hello,

I have a plink 2.0 file ("Ref" is always the major allele) and want to reorder its ref and alt alleles by a fasta reference. For this I use the --ref-from-fa command. However, the following error occurs:

Error: --ref-from-fa wants to change reference allele assignment at X:2700157, but it's marked as 'known'. Add the 'force' modifier to force this change.

What does marked as "known" mean? How should I go on?

Best,

Andreas

PLINK2 • 2.1k views
2.9 years ago

plink 2.0 marks a reference allele as "provisional" instead of "known" when it comes from a plink 1 .bed file, or a VCF directly generated by plink from such a .bed file. Some of these reference alleles are expected to be wrong.

However, when a reference allele comes from a regular VCF file, it's expected to be correct; that's why an additional --ref-from-fa modifier ('force') is required to change it.
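
To see how many variants are currently marked provisional, one option is to count the PR flag in the INFO column of the .pvar file, which is where plink 2.0 records provisional reference alleles. The following is only a rough sketch: the function name `count_provisional` is made up for illustration, and it assumes an uncompressed .pvar with a `#CHROM` header line.

```shell
# Hypothetical helper (not part of plink2): count variants carrying the
# PR (provisional reference) flag in the INFO column of a .pvar file.
# Assumes an uncompressed .pvar with a "#CHROM ..." header line.
count_provisional() {
  awk '
    /^##/     { next }                       # skip meta-information lines
    /^#CHROM/ { for (i = 1; i <= NF; i++)    # locate INFO by header name
                  if ($i == "INFO") col = i
                next }
              { n++
                if (col && $col ~ /(^|;)PR(;|$)/) pr++ }
    END       { printf "%d of %d variants flagged PR\n", pr, n }
  ' "$1"
}
```

Usage would look like `count_provisional file.pvar`. If there is no INFO column at all, the count is simply zero, meaning every REF allele is treated as known.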

Hmmm. My data does not come from a "regular vcf" file, but from a standard GWAS-microarray workflow with PLINK. How would you proceed? --> "Force" it?

Yes, if you know the supposed REF alleles are really just major alleles, that's what "force" is for.

If I do so, I get the following report:

--ref-from-fa force: 4685 variants changed, 1 validated.

In total I only have 4686 variants...

The fasta reference which I use is from here: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/human_g1k_v37.fasta.gz

And here is my command:

plink2 --pfile file --ref-from-fa force --fa reference.fasta --make-pgen --out output

What could be wrong here?

Thanks for your help!

Perhaps your original dataset had "backwards" alleles (with REF usually A1 instead of A2) for some reason, while still being correctly encoded. You can sanity-check this by running --freq on your updated dataset: if the alternate allele frequencies are usually low, you're probably fine.
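
As a rough sketch of that sanity check: after running something like `plink2 --pfile output --freq --out output_freq`, the resulting .afreq file can be summarized with awk. The helper name `count_major_alt` and the file names are placeholders, and the sketch assumes biallelic variants (a single value in the ALT_FREQS column).

```shell
# Hypothetical helper (not part of plink2): report how many variants in a
# plink2 .afreq file have an ALT allele frequency above 0.5.
# Assumes biallelic variants, i.e. one number per ALT_FREQS entry.
count_major_alt() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++)      # locate ALT_FREQS by header name
                if ($i == "ALT_FREQS") col = i
              next }
    col && $col ~ /^[0-9]/ {                 # skip non-numeric entries
      n++
      if ($col + 0 > 0.5) high++
    }
    END { printf "%d of %d variants have ALT_FREQS > 0.5\n", high, n }
  ' "$1"
}
```

Usage would look like `count_major_alt output_freq.afreq`. If only a small fraction of variants have ALT as the major allele, the forced reassignment is probably fine.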
