Question

What's the point of genotype pruning before imputation?

1

4.8 years ago

optimistsso4co3 ▴ 110

Hey guys,

The point of imputation is to predict untyped SNPs using D' or r2 values. However, it is standard to remove all SNPs in LD before imputation, e.g. with the plink command --indep-pairwise. May I ask what the purpose is of removing SNPs that were actually typed and then trying to "predict them back"? I'm surely missing something, but still, curiosity is the point of my job, heh.

edit: I'm following Joni Coleman's tutorial (he uses --indep-pairwise to prune and later removes these SNPs)

imputation plink --indep-pariwise r2 • 1.6k views

1

Could you provide a link to that protocol:

standard to remove all SNPs in LD before imputation

4.8 years ago by zx8754 11k

0

Hey Zack, thanks for the response! I'm following Joni Coleman's tutorial: https://github.com/JoniColeman/gwas_scripts

4.8 years ago by optimistsso4co3 ▴ 110

score 1 · Answer 1 · 2019-07-17

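They are not pruning before imputation. Pruning is a step that comes before other QC steps: the LD-pruned SNP set is used to calculate IBD and remove related individuals, or to calculate PCs and remove outlier individuals based on ethnicity. The pruned SNPs are only set aside for those calculations; they are not dropped from the data that goes on to imputation. As the final step of QC, there will be a plink command something like: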

plink --bfile original \
--remove individualsRelated individualsPCoutlier... individualsCallRate ... \
--exclude SNPsHWE... SNPsOtherQC, ... \
--recode oxford \
--out originalQC
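
To make the order of steps concrete, here is a rough dry-run sketch of how the pruned list is typically used. It is only an illustration, not the tutorial's actual script: the file names (pruned, ibd, pcs, samples_to_remove.txt, snps_failed_qc.txt) and the 50 5 0.2 pruning thresholds are my assumptions, and the `run` helper just prints each command instead of executing it.

```shell
set -eu

BFILE="original"              # fileset prefix, as in the command above
PRUNED_IN="pruned.prune.in"   # PLINK writes <out>.prune.in and <out>.prune.out

# Dry-run helper (hypothetical): prints the command. Swap the body for "$@" to execute.
run() { printf '%s\n' "$*"; }

# 1. Build an LD-pruned SNP list. Window, step and r2 threshold are illustrative.
run plink --bfile "$BFILE" --indep-pairwise 50 5 0.2 --out pruned

# 2. Use ONLY the pruned subset for relatedness (IBD) and PCA.
run plink --bfile "$BFILE" --extract "$PRUNED_IN" --genome --out ibd
run plink --bfile "$BFILE" --extract "$PRUNED_IN" --pca --out pcs

# 3. Final QC on the FULL SNP set: drop flagged samples and QC-failed SNPs only.
#    The prune.out list is not passed to --exclude here.
run plink --bfile "$BFILE" --remove samples_to_remove.txt \
    --exclude snps_failed_qc.txt --recode oxford --out originalQC
```

The thing to notice is that the pruned list only ever appears after --extract in step 2. The dataset written in step 3 still contains the SNPs that were in LD, so nothing is thrown away and then "predicted back" by imputation.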