I am trying to impute genome-wide SNP data with IMPUTE2. The problem is that I'm a biologist and far from an expert in bioinformatics, so perhaps you can help me. I installed the software, and the examples from the website run without problems. Now I want to run the analysis on my own data. My plan is to pre-phase the study genotypes and then impute into the pre-phased haplotypes using two phased reference panels (HapMap and 1000 Genomes, CEU samples only).
I have no problem generating the -g file and the corresponding -strand_g file, but the -m, -h and -l files confuse me. The tutorial gives a download link for these reference files. I downloaded the "1000 Genomes Pilot + HapMap 3 CEU build 36" archive (about 500 MB), and it seems to contain the required information (m, h, l). However, I'm not sure about this, and I don't know how to specify it on the command line to run the analysis. In the examples there is a separate file for each flag. How should I handle this?
My second problem is telling the program to use only the data for a single chromosome, for example chromosome 1. Do I have to generate separate files for every chromosome? If so, how? I did not find any option in IMPUTE2 that restricts the analysis to a specific chromosome.
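To make clear what I mean by "files for every chromosome", here is a minimal Python sketch of the kind of split I have in mind. It assumes that the first whitespace-separated column of each line in my genotype file holds the chromosome label, which may not be true for every file. The function names and the output naming scheme are made up for illustration and are not part of IMPUTE2.

```python
from collections import defaultdict


def split_by_chromosome(lines):
    """Group genotype lines by the label in their first column.

    Assumption: column 1 of every line is the chromosome label.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        chrom = line.split(None, 1)[0]
        groups[chrom].append(line)
    return dict(groups)


def write_per_chromosome(in_path, out_prefix):
    """Write one file per chromosome; the naming scheme is hypothetical."""
    with open(in_path) as handle:
        groups = split_by_chromosome(handle)
    for chrom, rows in groups.items():
        with open(f"{out_prefix}.chr{chrom}.gen", "w") as out:
            out.write("\n".join(rows) + "\n")
    return sorted(groups)
```

Calling write_per_chromosome("mydata.gen", "mydata") would then produce files such as mydata.chr1.gen, one per chromosome label found in the input. Is something like this needed, or is there a simpler way?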
I know that these are rather basic questions, but maybe someone has a quick and easy solution.
My platform is Windows.