Question

Improving accuracy in small regions with Impute2

9.0 years ago

lel7lel7 ▴ 30

Hi All,

The Impute2 manual recommends the standard Impute2 MCMC algorithm for fine-scale imputation of small genomic regions, but it doesn't explain how to run it. Any help would be appreciated.

I'm using Impute2 to impute the whole genomes of 1000 individuals in 5 Mb blocks. With pre-phasing I am getting ~90% concordance for chromosomes 1-13 and ~85% concordance for chromosomes 14-22. This is before filtering for low MAF etc. I then re-ran the imputation on two 5 Mb blocks without pre-phasing to try to improve the accuracy in these target regions. For one block I see only a 0.2% improvement in accuracy, and for the second block the pre-phased imputation is actually more accurate (by 0.2%).
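For reference, block boundaries like the ones passed to -int can be generated with a short loop. This is only a minimal sketch: the helper name `make_blocks` and the 12 Mb length in the example call are hypothetical, and it assumes 1-based, inclusive, non-overlapping intervals.

```shell
# Hypothetical helper: print "start stop" pairs that tile 1..chr_len
# in fixed-size blocks (5 Mb unless a second argument is given).
make_blocks() {
    chr_len=$1
    block=${2:-5000000}
    start=1
    while [ "$start" -le "$chr_len" ]; do
        stop=$((start + block - 1))
        # Clip the last block to the end of the chromosome.
        if [ "$stop" -gt "$chr_len" ]; then
            stop=$chr_len
        fi
        echo "$start $stop"
        start=$((stop + 1))
    done
}

# Example: a made-up 12 Mb region split into 5 Mb blocks.
make_blocks 12000000
```

Each printed pair could then be read into $start and $stop for one Impute2 run.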

My command for this fine-scale imputation is as follows...

eval $IMPUTE \
-g chr"$chr".gen \
-m $REF/genetic_map_chr"$chr"_combined_b37.txt \
-h 1000GP_Phase3_chr"$chr".hap \
-l $REF/1000GP_Phase3_chr"$chr".legend \
-int $start $stop \
-buffer 1000 \
-align_by_maf_g \
-Ne 20000 \
-k 100 \
-o $OUTDIR/chr"$chr".$start.$stop.one.phased.impute2 \
-phase

Basically, I have just run Impute2 without the -prephase_g, -known_haps_g and -use_prephase_g flags. I have also increased -k from 80 to 100 and increased the buffer region from 250 kb to 1 Mb. Is this correct? Has anyone else tried this method?
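To make the difference between the two approaches concrete, the mode-specific flags can be laid out side by side. This is a sketch based on my reading of the Impute2 documentation, not a verified recipe: the helper name `mode_flags` and the example haplotype file name are hypothetical, and the function only prints flags, it does not run Impute2.

```shell
# Hypothetical dry-run helper: echo only the flags that differ between modes.
#   prephase          step 1 of the pre-phasing workflow (phase study genotypes)
#   impute_prephased  step 2 (impute into the pre-phased haplotypes given as $2)
#   mcmc              single-step run on unphased genotypes (-g), no extra flags
mode_flags() {
    case "$1" in
        prephase)         echo "-prephase_g" ;;
        impute_prephased) echo "-use_prephase_g -known_haps_g $2" ;;
        mcmc)             echo "" ;;
        *)                echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Example with a made-up pre-phased haplotype file name.
mode_flags prephase
mode_flags impute_prephased chr1.prephased.impute2_haps
mode_flags mcmc
```

The remaining options (-m, -h, -l, -int, -buffer, -Ne, -k, -o) would be shared between the runs, so any accuracy difference should come from the phasing step alone.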

Thanks in advance,
Lesley

Impute2 accuracy SNP imputation genome • 2.1k views

updated 20 months ago by Ram 43k • written 9.0 years ago by lel7lel7 ▴ 30