Question: Convert and filter VCF by a pre-existing BIM
3.4 years ago by
USA/Huntsville/HudsonAlpha Institute for Biotechnology
jlawlor wrote:

Hi all, I'm trying to convert a VCF to PLINK format (BED/BIM/FAM), however I'd like to do so using a fixed, pre-existing BIM file. In other words, I'd like to filter an arbitrary VCF down to the SNPs listed in a particular BIM file (and also add in a reference or no-call if the SNP isn't in said VCF).

Is this possible using built-in commands for PLINK? (I'd prefer not to go the route of manually editing PED/MAP files so that I don't inadvertently swap data.)

Context: I'm running ancestry analysis with ADMIXTURE in projection mode to project a new sample onto an already-analyzed reference population, which requires identical BIM files. I can do this manually by converting, joining, and manipulating the files; however, I'd like to integrate this into an automated pipeline.


Tags: plink, admixture, vcf
3.4 years ago by
United States
chrchang523 wrote:

Assuming your variants have unique IDs, you can use plink --write-snplist (or Unix "cut -d [delimiter] -f 2") on the .bim file to create a list of variant IDs to keep, and then plink --extract to keep just those variants in another dataset.
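A minimal sketch of the two steps described in this answer, under some assumptions: the .bim is tab-delimited (so `cut` needs no `-d`), and the file names `ref.bim`, `keep_ids.txt`, `input.vcf`, and `test_set` are placeholders. The toy .bim contents are made up for illustration, and the plink call runs only if plink and the input file are actually present.

```shell
#!/bin/sh
# Toy tab-delimited .bim (chrom, variant ID, cM, bp, A1, A2); values are made up.
printf '1\trs100\t0\t1000\tA\tG\n1\trs200\t0\t2000\tC\tT\n2\trs300\t0\t3000\tG\tA\n' > ref.bim

# Column 2 of a .bim is the variant ID; write one ID per line.
# (Alternative mentioned in the answer: plink --bfile ref --write-snplist --out ref)
cut -f 2 ref.bim > keep_ids.txt

# Keep only those variants in another dataset. Skipped unless plink is on
# the PATH and the placeholder input.vcf exists.
if command -v plink >/dev/null 2>&1 && [ -f input.vcf ]; then
    plink --vcf input.vcf --extract keep_ids.txt --make-bed --out test_set
fi
```

If the .bim is space-delimited instead, `cut -d ' ' -f 2` would be the equivalent, which is the `[delimiter]` placeholder in the answer.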


(Evidently --extract works with a BIM file directly, without the Unix cut command. Handy!)

That's nearly what I need. However, the output BIM (let's call it test_set.bim) now has fewer variants than the original ids_to_keep.bim. Is there a flag I can add, or a second step I can perform, to fill in the missing genotypes (preferably with no-calls)?

So far I've tried --fill-missing-a2, but that doesn't seem to be what I want, or I've made some mistake somewhere, since it gives me the error "Error: --fill-missing-a2 cannot be used on an unsorted .bim file."

Using: ~/tools/plink/plink --extract ids_to_keep.bim --vcf input.vcf --make-bed --out test_set

Thanks for the advice!
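One way to see exactly which variants dropped out is to compare the two .bim files by variant ID. The following is only a sketch with made-up toy data standing in for `ids_to_keep.bim` and `test_set.bim`; the output name `missing_variants.bim` is a placeholder. It identifies the missing rows but does not itself add them back as no-calls.

```shell
#!/bin/sh
# Toy files standing in for the real ones; contents are made up.
printf '1\trs100\t0\t1000\tA\tG\n1\trs200\t0\t2000\tC\tT\n2\trs300\t0\t3000\tG\tA\n' > ids_to_keep.bim
printf '1\trs100\t0\t1000\tA\tG\n2\trs300\t0\t3000\tG\tA\n' > test_set.bim

# First pass (NR==FNR) records the IDs present in test_set.bim; second pass
# prints the reference .bim rows whose ID was not seen.
# Note: this idiom assumes test_set.bim is non-empty.
awk 'NR==FNR { seen[$2] = 1; next } !($2 in seen)' \
    test_set.bim ids_to_keep.bim > missing_variants.bim

# Show the variants that would need to be filled in.
cat missing_variants.bim
```

The resulting list could then feed whatever fill-in step is chosen for the pipeline.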

Reply written 3.4 years ago by jlawlor