Convert and filter VCF by a pre-existing BIM
6.9 years ago
jlawlor • 0

Hi all, I'm trying to convert a VCF to PLINK format (BED/BIM/FAM), however I'd like to do so using a fixed, pre-existing BIM file. In other words, I'd like to filter an arbitrary VCF down to the SNPs listed in a particular BIM file (and also add in a reference or no-call if the SNP isn't in said VCF).

Is this possible using built-in commands for PLINK? (I'd prefer not to go the route of manually editing PED/MAP files so that I don't inadvertently swap data.)

Context: I'm running ancestry analysis with ADMIXTURE (https://www.genetics.ucla.edu/software/admixture/) in projection mode to project a new sample onto an already-analyzed reference population, which requires identical BIM files. I can do this manually by converting, joining, and manipulating the files; however, I'd like to integrate this into an automated pipeline.

Thanks!

plink VCF admixture • 1.9k views
6.9 years ago

Assuming your variants have unique IDs, you can use plink --write-snplist (or Unix "cut -d [delimiter] -f 2") on the .bim file to create a list of variant IDs to keep, and then plink --extract to keep just those variants in another dataset.
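A minimal sketch of the cut route, assuming a tab-delimited .bim whose second column holds the variant ID. The three-variant BIM and its rs IDs are made up for illustration, and the file names are borrowed from the follow-up below. The PLINK step only runs if plink and input.vcf are actually present.

```shell
# Tiny stand-in for the reference BIM (hypothetical IDs). Columns are:
# chromosome, variant ID, cM position, bp position, allele 1, allele 2.
printf '1\trs1\t0\t1000\tA\tG\n1\trs2\t0\t2000\tC\tT\n2\trs3\t0\t3000\tG\tA\n' > ids_to_keep.bim

# Column 2 holds the variant IDs; cut splits on tabs by default.
cut -f 2 ids_to_keep.bim > ids_to_keep.txt

# Keep only those variants while converting the VCF to BED/BIM/FAM.
# Skipped when plink or input.vcf is not available.
if command -v plink >/dev/null 2>&1 && [ -f input.vcf ]; then
    plink --vcf input.vcf --extract ids_to_keep.txt --make-bed --out test_set
fi
```

If the .bim is space-delimited instead, pass the delimiter explicitly with cut -d ' ' -f 2.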


(Evidently --extract accepts a BIM file directly, without the Unix cut step. Handy!)

That's nearly what I need--however, the output BIM (let's call it test_set.bim) now has fewer variants than the original "ids_to_keep.bim." Is there a flag I can add or a second step I can perform to fill in the missing genotypes? (Preferably with no calls.)

So far I've tried --fill-missing-a2, but that doesn't seem to be what I want, or I've made some mistake somewhere, since it gives me the error "Error: --fill-missing-a2 cannot be used on an unsorted .bim file."

Using: ~/tools/plink/plink --extract ids_to_keep.bim --vcf input.vcf --make-bed --out test_set

Thanks for the advice!
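Before filling anything in, it may help to see exactly which variants were dropped. This is only a sketch: it lists the IDs that are in ids_to_keep.bim but absent from test_set.bim, and it does not add genotypes. Both small BIM files and their rs IDs are hypothetical stand-ins, and tab-delimited files are assumed.

```shell
# Hypothetical stand-ins: the reference BIM and the smaller BIM from --extract.
printf '1\trs1\t0\t1000\tA\tG\n1\trs2\t0\t2000\tC\tT\n2\trs3\t0\t3000\tG\tA\n' > ids_to_keep.bim
printf '1\trs1\t0\t1000\tA\tG\n2\trs3\t0\t3000\tG\tA\n' > test_set.bim

# comm needs sorted input, so sort the ID columns first.
cut -f 2 ids_to_keep.bim | sort > wanted_ids.txt
cut -f 2 test_set.bim | sort > present_ids.txt

# -23 suppresses lines unique to the second file and lines common to both,
# leaving the IDs that are wanted but not present.
comm -23 wanted_ids.txt present_ids.txt > missing_ids.txt
cat missing_ids.txt
```

The resulting missing_ids.txt is the set of variants that would still need a no-call or reference entry for the output BIM to match the reference one.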
