Question: Is there a way to force a specific order for variants in a PLINK file that have the same chromosome and bp position?
curious wrote (8 months ago):

I have a VCF that I am reading into PLINK.

This VCF has many variants that are redundant in position (same chrom + base pair position).

I read this VCF into PLINK to do some data manipulations, then convert PLINK to VCF using the internal PLINK functionality.

I am noticing that variants with redundant positions are not always written to the PLINK-produced VCF in the same order as in the original VCF (I'm talking about the order of the variants themselves, not the flipping of alleles).

i.e.

The order in the original VCF:

variant A (redundant position with variant B)

variant B (redundant position with variant A)

might be written out from PLINK in the reverse order:

variant B (redundant position with variant A)

variant A (redundant position with variant B)

All the variants with unique positions seem fine.

Is there a way to force a variant order for these redundant variants by using a reference file? Alternatively, is there a way to have PLINK output a file that lists the variants in the exact order in which they will be written to a VCF? I thought about using the .bim file as a reference for the order that PLINK will use, but I am not sure that is accurate. Thank you.

chrchang523 wrote (8 months ago):

(edit: oops, this answers the wrong question. Leaving this up to provide context for the OP’s response.)

With plink 1.9, --keep-allele-order preserves allele order in the current run, and --a2-allele can be used to re-import allele order.

plink 2.0 defaults to preserving allele order.


But this is for the major/minor allele order within a single variant, correct? I am looking for the order of the variants themselves (i.e. the order of rsIDs).

written 8 months ago by curious

Yeah, I misread your question.

One way to enforce a specific ID order is:

  1. Create a temporary plink fileset with one sample (give it a new sample ID) and the desired variant ID order. All genotypes can be missing.

  2. plink --bfile temp --bmerge <real fileset> --out merged

  3. plink --bfile merged --remove temp.fam --make-bed --out sorted

This should work because --bmerge uses the variant ID order in the base fileset when possible.
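
Step 1 can be sketched in Python. This is a minimal illustration, not a PLINK feature: the function name `write_order_template`, the sample ID `ORDER_DUMMY`, and the file names are hypothetical. It assumes variant IDs in the real .bim are unique and that the desired order differs from the real one only among variants sharing a position. It copies the .bim rows in the desired ID order and writes a one-sample .bed in which every genotype is missing (2-bit code 01 in the variant-major .bed format).

```python
from pathlib import Path


def write_order_template(real_bim, desired_ids, out_prefix, sample_id="ORDER_DUMMY"):
    """Write <out_prefix>.bed/.bim/.fam with one all-missing sample.

    Variants appear in the order given by desired_ids; chromosome, position
    and allele columns are copied from the real .bim (hypothetical helper).
    """
    # Index the real .bim rows by variant ID (second column).
    rows = {}
    for line in Path(real_bim).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 6:
            rows[fields[1]] = fields

    unknown = [v for v in desired_ids if v not in rows]
    if unknown:
        raise ValueError(f"IDs not found in {real_bim}: {unknown[:5]}")

    # .bim: same rows as the real fileset, reordered.
    bim_lines = ["\t".join(rows[v]) for v in desired_ids]
    Path(f"{out_prefix}.bim").write_text("\n".join(bim_lines) + "\n")

    # .fam: one placeholder sample (FID IID father mother sex phenotype).
    Path(f"{out_prefix}.fam").write_text(f"{sample_id} {sample_id} 0 0 0 -9\n")

    # .bed: 3 magic bytes (variant-major), then one byte per variant.
    # With a single sample, 0x01 encodes "missing" in the low two bits.
    magic = bytes([0x6C, 0x1B, 0x01])
    Path(f"{out_prefix}.bed").write_bytes(magic + b"\x01" * len(desired_ids))
    return len(desired_ids)
```

Calling, for example, `write_order_template("real.bim", ids_in_vcf_order, "temp")` would produce the `temp` fileset used as the base in steps 2 and 3, where `ids_in_vcf_order` is the list of variant IDs read from the original VCF.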

modified 8 months ago • written 8 months ago by chrchang523

Thank you so much and PLINK is amazing!

written 8 months ago by curious