I am doing a genome-wide association study (GWAS) in an Indian population. So far the study has gone well, but I am now stuck at the imputation step. I have the following two queries:

1) Since there is no standard Indian reference panel, can I use the 1000 Genomes Phase I v3 reference for imputation in an Indian population?

2) What do the following files mean, and where can I get them?

--snps chr.snps ----- SNPs in phased haplotypes. These should largely be a subset of the SNPs in the reference panel.

--haps chr.haps.gz ----- Phased haplotypes where missing genotypes will be imputed.

Thanks in advance for any help.

It would be very helpful if you mentioned which software you are using.

About your first question:

If you use IMPUTE2, you can use the complete 1000 Genomes reference panel. IMPUTE2 will try to find the best subset of the reference haplotypes for your study panel. Imputation with IMPUTE2 gives fairly good results even when the reference panel comes from unrelated populations (although of course it is always better to have a population-specific reference panel).
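To make this a bit more concrete, here is a rough sketch of how an IMPUTE2 run against the full reference panel could be assembled for one chromosome chunk. It is only an illustration: the file names, the chunk boundaries, and the helper function `build_impute2_command` are placeholders I made up, and the `-Ne` and `-k_hap` values are just commonly used settings, not recommendations for your data. As far as I know, `-k_hap` controls how many reference haplotypes IMPUTE2 picks as templates for each study individual, which is how it narrows the full panel down to a best-matching subset, but please check the IMPUTE2 documentation before relying on it.

```python
def build_impute2_command(study_gen, ref_hap, ref_legend, genetic_map,
                          start, end, out_prefix,
                          ne=20000, k_hap=500, binary="impute2"):
    """Assemble an IMPUTE2 argument list for one chromosome chunk.

    All file names passed in are hypothetical placeholders; nothing is
    executed here, the function only builds the command.
    """
    if start >= end:
        raise ValueError("chunk start must be smaller than chunk end")
    return [
        binary,
        "-m", genetic_map,            # recombination map for the chromosome
        "-h", ref_hap,                # reference haplotypes (e.g. 1000 Genomes)
        "-l", ref_legend,             # legend file describing the reference SNPs
        "-g", study_gen,              # study genotypes in .gen format
        "-int", str(start), str(end), # base-pair interval to impute
        "-Ne", str(ne),               # effective population size parameter
        "-k_hap", str(k_hap),         # number of reference haplotypes used as templates
        "-o", out_prefix,             # output file name
    ]


# Placeholder inputs for a single 5 Mb chunk of chromosome 20
cmd = build_impute2_command(
    study_gen="study_chr20.gen",
    ref_hap="ref_chr20.hap.gz",
    ref_legend="ref_chr20.legend.gz",
    genetic_map="genetic_map_chr20.txt",
    start=1,
    end=5000000,
    out_prefix="study_chr20.chunk1.impute2",
)
print(" ".join(cmd))
```

You would normally loop over chunks of a few Mb per chromosome and pass each list to something like `subprocess.run(cmd, check=True)` once the paths point at real files.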

