Selecting SNPs to impute in Plink
9.5 years ago
hessjl ▴ 90

Hi all,

I want to construct a workflow that will allow me to impute small batches of SNPs that were untyped in my GWAS data (in .bed/.bim/.fam format).

Is it possible to meaningfully subset my reference panel (HapMap release 23) so that I only impute the ~8000 missing SNPs? Would it be a mistake to start by subsetting the reference to the tag SNPs for those 8000 SNPs, then merge the reference with the GWAS data, and then impute genotypes?

Thanks for the help in advance!

plink impute • 2.9k views
9.5 years ago

Is there a reason you still want to use Plink here? Its imputation algorithm was already considered obsolete 5 years ago: see http://www.ncbi.nlm.nih.gov/pubmed/19089453.

I suggest choosing a more effective imputation tool and applying it to your data. BEAGLE, IMPUTE2, and MaCH continue to be reasonable choices, and it should be straightforward to convert your data to their formats. (1000 Genomes phase 3 reference panels are also now available for all three programs; they should yield more accurate results than HapMap release 23.)
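Whichever tool you pick, a useful first step is to confirm which of your target SNPs are actually absent from the genotyped data. Here is a minimal Python sketch of that check. It assumes the standard six-column PLINK .bim layout (chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2); the function names and the toy rsIDs are made up for illustration.

```python
import io


def read_bim_ids(handle):
    """Collect variant IDs (second column) from an open PLINK .bim file."""
    ids = set()
    for line in handle:
        fields = line.split()
        # A well-formed .bim row has six whitespace-separated columns.
        if len(fields) >= 6:
            ids.add(fields[1])
    return ids


def untyped_targets(target_ids, typed_ids):
    """Return the targets missing from the genotyped set, in input order."""
    return [snp for snp in target_ids if snp not in typed_ids]


# Toy stand-in for a real .bim file (hypothetical IDs and positions).
bim_text = "1\trs1\t0\t1000\tA\tG\n1\trs2\t0\t2000\tC\tT\n"
typed = read_bim_ids(io.StringIO(bim_text))
missing = untyped_targets(["rs1", "rs3", "rs4"], typed)
print(missing)
```

With a real dataset you would pass `open("mydata.bim")` (a placeholder filename) instead of the in-memory string.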


Thanks for this insight! As a follow-up, are you suggesting that whole-GWAS imputation is the way to go, or can I still pre-filter my reference set to speed up the imputation turnaround?


IMPUTE2 should support this sort of variant filtering: see http://mathgen.stats.ox.ac.uk/impute/impute_v2.html#ex4.
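If you would rather restrict imputation to the regions around your target SNPs, one possible approach is to group the targets into genomic windows and run each window separately (IMPUTE2 accepts an interval through its `-int` option). The sketch below is only an illustration: the function name, the 250 kb buffer, and the 500 kb merge gap are arbitrary assumptions, not recommended settings, and the positions are invented. The buffer is there because imputation relies on typed SNPs flanking each target, so the windows should not be cut down to the target positions alone.

```python
def group_into_windows(positions, buffer_bp=250_000, max_gap_bp=500_000):
    """Merge nearby target SNPs into padded (chrom, start, end) windows.

    positions: iterable of (chromosome, base-pair position) pairs.
    buffer_bp and max_gap_bp are illustrative defaults, not tuned values.
    """
    windows = []
    for chrom, bp in sorted(positions):
        last = windows[-1] if windows else None
        if last and last[0] == chrom and bp - last[2] <= max_gap_bp:
            # Close enough to the previous target: extend that window.
            last[2] = bp
        else:
            windows.append([chrom, bp, bp])
    # Pad both sides so flanking typed SNPs fall inside each window.
    return [(c, max(1, s - buffer_bp), e + buffer_bp) for c, s, e in windows]


# Hypothetical target positions on two chromosomes.
targets = [("1", 1_000_000), ("1", 1_200_000), ("1", 5_000_000), ("2", 300_000)]
windows = group_into_windows(targets)
for chrom, start, end in windows:
    print(chrom, start, end)
```

Each resulting window could then be passed to a separate imputation run, which keeps the jobs small without discarding the surrounding reference haplotypes.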
