Question

Selecting SNPs to impute in Plink

9.5 years ago

hessjl ▴ 90

Hi all,

I want to construct a workflow that lets me impute small batches of SNPs that were untyped in my GWAS data (in .bed/.bim/.fam format).

Is it possible to meaningfully subset my reference panel (HapMap release 23) so that I only impute the ~8000 missing SNPs? Would it be a mistake to first subset the reference panel to the tag SNPs for those 8000 SNPs, then merge the reference with the GWAS data, and then impute genotypes?

Thanks for the help in advance!

plink impute • 2.9k views

updated 2.2 years ago by Ram 43k • written 9.5 years ago by hessjl ▴ 90

Ram · Answer 1 · 2014-11-18

9.5 years ago

chrchang523 10k

Is there a reason you still want to use Plink here? Its imputation algorithm was already considered obsolete 5 years ago: see http://www.ncbi.nlm.nih.gov/pubmed/19089453.

I suggest choosing a more effective imputation tool and applying it to your data. BEAGLE, IMPUTE2, and MaCH continue to be reasonable choices, and it should be straightforward to convert your data to their formats. (1000 Genomes phase 3 reference panels are also now available for all three programs; they should yield more accurate results than HapMap release 23.)
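As a rough illustration of the conversion step, here is a minimal sketch that builds a PLINK command line for exporting one chromosome of a binary fileset to Oxford .gen/.sample files, the input format IMPUTE2 reads. It assumes PLINK 1.9 is installed and on your PATH as `plink`; the function names and file prefixes are hypothetical, so adapt them to your setup.

```python
import subprocess


def build_recode_command(bfile_prefix, chromosome, out_prefix, plink="plink"):
    """Build a PLINK 1.9 command that exports one chromosome to .gen/.sample.

    bfile_prefix: prefix of the .bed/.bim/.fam fileset (hypothetical name).
    chromosome:   chromosome code to extract, e.g. 22.
    out_prefix:   prefix for the output files.
    plink:        name or path of the PLINK executable (assumed to be 1.9).
    """
    return [
        plink,
        "--bfile", bfile_prefix,
        "--chr", str(chromosome),
        "--recode", "oxford",
        "--out", out_prefix,
    ]


def convert(bfile_prefix, chromosome, out_prefix, plink="plink"):
    """Run the export; raises CalledProcessError if PLINK exits with an error."""
    cmd = build_recode_command(bfile_prefix, chromosome, out_prefix, plink)
    subprocess.run(cmd, check=True)
```

Calling `convert("gwas", 22, "gwas_chr22")` would be expected to write `gwas_chr22.gen` and `gwas_chr22.sample`, assuming a fileset named `gwas` exists in the working directory.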

updated 2.2 years ago by Ram 43k • written 9.5 years ago by chrchang523 10k

Thanks for this insight! Are you implying that whole-GWAS imputation is the way to go, or can I still pre-filter my reference set to speed up the imputation?

updated 2.2 years ago by Ram 43k • written 9.5 years ago by hessjl ▴ 90

IMPUTE2 should support this sort of variant filtering: see http://mathgen.stats.ox.ac.uk/impute/impute_v2.html#ex4
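Whichever tool does the filtering, you will probably need a list of the SNPs that are actually untyped in your data. A minimal sketch, assuming your target SNPs are identified by the same IDs used in column 2 of the .bim file (the function names and file names here are hypothetical, and the exact list format each imputation tool expects is not covered):

```python
def read_bim_ids(bim_path):
    """Return the set of variant IDs (second column) in a PLINK .bim file."""
    ids = set()
    with open(bim_path) as handle:
        for line in handle:
            fields = line.split()
            # .bim columns: chromosome, variant ID, cM position, bp position,
            # allele 1, allele 2
            if len(fields) >= 2:
                ids.add(fields[1])
    return ids


def untyped_snps(target_ids, bim_path):
    """Return the target IDs that are absent from the .bim, in input order."""
    typed = read_bim_ids(bim_path)
    return [snp for snp in target_ids if snp not in typed]


def write_snp_list(snp_ids, out_path):
    """Write one SNP ID per line."""
    with open(out_path, "w") as handle:
        for snp in snp_ids:
            handle.write(snp + "\n")
```

For example, `write_snp_list(untyped_snps(wanted, "gwas.bim"), "to_impute.txt")` would leave you with a plain-text list of the SNPs still to be imputed, where `wanted` is your list of ~8000 IDs. Matching on IDs alone will miss SNPs whose rsIDs differ between your data and the reference panel, so matching on chromosome and position may be safer.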

9.5 years ago by chrchang523 10k