IMPUTE2 variable question
4.3 years ago
raalsuwaidi ▴ 100

Dear all

I am trying to impute a VCF file using my own reference panel, but I am getting confused by the parameters. Here is what I understand:

-g is my vcf file that I am trying to impute 

-known_haps_g is the hap file of the file I am planning to impute in case I do not provide it in -g

-m is the recombination file as per the sample provided by impute2

-h the haplotypes for the reference file

-l the legend file of the reference file

Is that correct?

4.3 years ago

You have it broadly correct. However, -known_haps_g is used in conjunction with -use_prephased_g, and takes the output of a pre-phasing step, i.e., with SHAPEIT2. If you use these parameters, then you do not have to supply -g. If you do use -g, the file should be in GEN format.
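To make the two input modes concrete, here is a minimal Python sketch that only assembles the IMPUTE2 argument list; it does not run anything. The binary name `impute2`, all file names, and the `-int` interval are placeholders I have assumed for illustration, so please check the flags against the IMPUTE2 documentation for your version.

```python
import shlex


def impute2_command(ref_hap, ref_legend, gen_map, out, interval,
                    study_gen=None, study_haps=None):
    """Build an IMPUTE2 argument list for unphased (-g) or pre-phased input."""
    # Exactly one study input is expected: GEN genotypes or pre-phased haplotypes.
    if (study_gen is None) == (study_haps is None):
        raise ValueError("supply exactly one of study_gen or study_haps")
    lower, upper = interval
    cmd = ["impute2",
           "-m", gen_map,      # recombination map
           "-h", ref_hap,      # reference panel haplotypes
           "-l", ref_legend,   # reference panel legend
           "-int", str(lower), str(upper),  # region to impute (placeholder values)
           "-o", out]
    if study_haps is not None:
        # Pre-phased route: -g is not needed.
        cmd += ["-use_prephased_g", "-known_haps_g", study_haps]
    else:
        # Unphased route: the study file must be in GEN format.
        cmd += ["-g", study_gen]
    return cmd


# Hypothetical file names, for illustration only.
print(shlex.join(impute2_command("ref.hap", "ref.legend", "map.txt",
                                 "study.imputed", (1, 5000000),
                                 study_haps="study.phased.haps")))
```

Passing `study_gen="study.gen"` instead of `study_haps` gives the plain -g form of the command.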

All explanations can be found here: https://mathgen.stats.ox.ac.uk/impute/impute_v2.html

If you want a workflow for pre-phasing with SHAPEIT2, then I have coincidentally provided an answer that covers this:

Another related answer:

SHAPEIT2 can accept PLINK data as input. You can convert VCF to PLINK with PLINK, if you want.

Kevin

Thank you so much. But how can I convert my VCF to GEN format?

Please take a look at an answer from the PLINK developer: C: How to convert vcf to impute2 format?

Alternatively, if you want to go the pre-phasing route, you can import / convert your VCF to PLINK format via PLINK itself (--vcf or --bcf flags), and then use that as input to SHAPEIT2. SHAPEIT2's output then feeds directly into IMPUTE2 via -use_prephased_g -known_haps_g.
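As a rough sketch of the first two steps of that route, the Python below builds the PLINK and SHAPEIT2 argument lists for one chromosome without executing them. The binary names `plink` and `shapeit`, the file names, and the chromosome number are assumptions for illustration; the genetic map is assumed to be in a format SHAPEIT2 accepts. Verify each flag against the documentation for the versions you have installed.

```python
def prephasing_commands(vcf, chrom, gen_map, prefix):
    """Return [plink_cmd, shapeit_cmd] as argument lists for one chromosome."""
    # Step 1: convert the VCF to PLINK binary format (.bed/.bim/.fam).
    plink_cmd = ["plink", "--vcf", vcf, "--chr", str(chrom),
                 "--make-bed", "--out", prefix]
    # Step 2: phase with SHAPEIT2, writing .haps and .sample files.
    haps = prefix + ".phased.haps"
    sample = prefix + ".phased.sample"
    shapeit_cmd = ["shapeit",
                   "--input-bed", prefix + ".bed", prefix + ".bim", prefix + ".fam",
                   "--input-map", gen_map,
                   "--output-max", haps, sample]
    return [plink_cmd, shapeit_cmd]


# Hypothetical inputs, for illustration only.
for cmd in prephasing_commands("study.vcf", 20, "map_chr20.txt", "study_chr20"):
    print(" ".join(cmd))
```

The resulting .haps file (here `study_chr20.phased.haps`) is what you would then pass to IMPUTE2 with -use_prephased_g -known_haps_g.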
