Question: How to convert IMPUTE2 to VCF format
3
4.2 years ago by
lkmklsmn890
United States
lkmklsmn890 wrote:

Hi,

I have imputed genotypes using IMPUTE2 (version 2.3.2). The output files look like this:

file

file_allele_probs

file_haps

file_info

file_info_by_sample

file_summary

file_warnings

Now I want to convert these files into VCF format. How can I do this? 

 

conversion impute2 vcf • 7.0k views
ADD COMMENTlink modified 2.9 years ago by dweeks.pitt40 • written 4.2 years ago by lkmklsmn890
2

Feeling pretty peeved about this.

bcftools expects a sample file with the extension .samples, not .sample, which is what plink produces.

It doesn't check whether the file exists and doesn't produce an error message. It just crashes with a segmentation fault if no file with that name is there. I had to run the program in gdb to find this out.

So the command is this: bcftools convert --gensample2vcf test

And the files needed are these: test.gen.gz test.samples
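Since a missing file reportedly ends in a segmentation fault rather than an error message, it may help to check the inputs first. The following is only a sketch: `check_gensample_inputs` is a hypothetical helper name, and the file names it looks for are simply the ones listed above (`PREFIX.gen.gz` and `PREFIX.samples`).

```shell
#!/bin/sh
# Hypothetical helper: confirm that the two files described above exist
# before handing PREFIX to `bcftools convert --gensample2vcf`.
check_gensample_inputs() {
    prefix=$1
    rc=0
    for f in "$prefix.gen.gz" "$prefix.samples"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            rc=1
        fi
    done
    # plink writes PREFIX.sample (no trailing "s"); point out the likely fix.
    if [ ! -f "$prefix.samples" ] && [ -f "$prefix.sample" ]; then
        echo "hint: found $prefix.sample; copy it to $prefix.samples" >&2
    fi
    return $rc
}
```

It could then be used as `check_gensample_inputs test && bcftools convert --gensample2vcf test -Oz -o test.vcf.gz`, where the output file name is just an example.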

ADD REPLYlink written 3.5 years ago by davenomiddlenamecurtis20

Thanks - the .samples extension got me as well. But now I get: Could not parse REF in "CHROM:POS_REF_ALT id: rs7899632:100000625:A:G". Grr.

ADD REPLYlink written 3.4 years ago by wuttke0

You can fix the identifier using awk (note that this one-liner hardcodes chromosome 10 as the prefix):

awk '{split($2,a,":"); $2="10:"a[2]"_"a[3]"_"a[4]; print}'
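A slightly more general sketch of the same idea, with the chromosome passed in as a parameter instead of the literal "10:". `fix_gen_ids` is a hypothetical function name, and it assumes one chromosome per .gen file and IDs of the form `rsid:pos:ref:alt` in column 2, as in the error message above. Lines whose ID does not have exactly four colon-separated parts are passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper: rewrite column 2 from rsid:pos:ref:alt to CHROM:POS_REF_ALT.
# $1 = chromosome label (assumption: the whole file belongs to one chromosome).
fix_gen_ids() {
    awk -v chr="$1" '{
        n = split($2, a, ":")
        # Only rewrite IDs that have the expected four parts.
        if (n == 4) $2 = chr ":" a[2] "_" a[3] "_" a[4]
        print
    }'
}
```

For example: `gzip -dc test.gen.gz | fix_gen_ids 10 | gzip > test.fixed.gen.gz` (the output file name is illustrative).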

ADD REPLYlink written 3.4 years ago by plott0
2
4.2 years ago by
United Kingdom
Alexander Skates350 wrote:

Try QCTOOL.

ADD COMMENTlink written 4.2 years ago by Alexander Skates350

This worked. It did not require a sample file. Thanks.

Important edit:  

qctool does NOT retain phasing information when converting to vcf format

ADD REPLYlink modified 3.8 years ago • written 4.2 years ago by lkmklsmn890
1
4.2 years ago by
Sean180
United States
Sean180 wrote:

Looks like BCFtools has an option for exactly that. Check out the bcftools convert --gensample2vcf ... command in the BCFtools documentation.

ADD COMMENTlink written 4.2 years ago by Sean180

Seems like this command wants 2 files (a gen file and a sample file). However, my IMPUTE2 output (using default options) does not contain a sample file. This has also been the issue with other conversion tools such as gtool and plink.

ADD REPLYlink written 4.2 years ago by lkmklsmn890

IMPUTE2 is written to be sample agnostic by default. IMPUTE2 documentation states: "Currently, the only reason to provide a sample file is if you want to exclude some individuals". You should get the samples from the input dataset you give to IMPUTE2.

ADD REPLYlink written 3.4 years ago by plott0
1
2.9 years ago by
dweeks.pitt40
United States
dweeks.pitt40 wrote:

While qctool will convert to VCF format without a sample file, this is only useful if you do not need to know the sample IDs, because without a sample file it generates dummy IDs in the VCF like this:

 sample_1        sample_2        sample_3        sample_4 ...

In all the work we do, we need to know which sample is which, so we have to keep track of sample IDs and can't use dummy sample IDs.
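If the sample order of the original input dataset is known, one possible workaround is to pull the real IDs from an Oxford-format .sample file and write them into the VCF header afterwards. This is only a sketch: `oxford_sample_ids` is a hypothetical helper, and it assumes that the rows of the .sample file are in the same order as the genotype columns in the converted VCF, which should be verified before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print one sample ID per line from an Oxford-format .sample file.
# The first two lines are the column header and the column-type row, so they are skipped.
# Assumption: column 2 (ID_2) holds the individual ID wanted in the VCF header.
oxford_sample_ids() {
    awk 'NR > 2 { print $2 }' "$1"
}
```

The resulting list could then be applied with `bcftools reheader`, for example `oxford_sample_ids study.sample > ids.txt` followed by `bcftools reheader -s ids.txt -o renamed.vcf qctool_output.vcf`, where all file names are placeholders.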

If you do have a sample file in addition to your gen file, then Mega2 can convert from IMPUTE2 format to VCF format. See the Mega2 documentation for details.

ADD COMMENTlink written 2.9 years ago by dweeks.pitt40