Question: plink2 bgen to vcf ukbiobank
14 months ago, richyanicky wrote:

Hello

I have imputed data from UK Biobank in BGEN format, and I would like to convert it to a VCF file.

I can use plink2 to make pgen files and then run plink2 again to create a VCF:

plink2 --bgen ukb_imp_chr17_v3.bgen --sample ukimp_chr17_v3_s.sample --make-pgen

plink2 --pgen plink2.pgen --pvar plink2.pvar --psam plink2.psam  --export vcf

This creates a VCF file, but none of our pipelines seem able to process it.

  1. Does what I did look correct?
  2. How do I check the vcf file for accuracy?

Thank you in advance,

Richard

modified 3 months ago by bcole • written 14 months ago by richyanicky

Hi Richard,

Did you figure out what the problem was? I am having the same issue.

Thanks!

written 4 months ago by aj

Yes, I followed the comment below by chrchang523 and it worked.

written 4 months ago by richyanicky

Further question for Chris here: for the UK Biobank BGEN data, what's the proper REF/ALT mode?

Warning: No --bgen REF/ALT mode specified ('ref-first', 'ref-last', or 'ref-unknown'). This will be required as of alpha 3.

written 3 months ago by bcole

The alpha 3 error message explicitly notes that UK Biobank BGENs use 'ref-first' encoding.
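
For illustration, the import step from the question could state the mode explicitly. This is only a sketch: it assumes a plink2 build that accepts the REF/ALT mode right after the --bgen filename (as the warning quoted above indicates), and the function name and output prefix are hypothetical, not from this thread.

```shell
# Import a UK Biobank BGEN with the REF/ALT mode stated explicitly.
# $1 = BGEN file, $2 = .sample file, $3 = output prefix
# (the function name and the output prefix are hypothetical).
bgen_to_pgen() {
  plink2 --bgen "$1" ref-first \
         --sample "$2" \
         --make-pgen \
         --out "$3"
}

# Example call with the file names from the question:
# bgen_to_pgen ukb_imp_chr17_v3.bgen ukimp_chr17_v3_s.sample ukb_chr17
```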

written 3 months ago by chrchang523
14 months ago, chrchang523 wrote:

Assuming you want dosage information in your VCF, you need to replace "--export vcf" with something like "--export vcf vcf-dosage=DS". You may also want to add the 'bgz' modifier to request bgzipping of the VCF file.
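
A minimal sketch of the export step with both suggestions applied. It assumes the fileset from the question was written with the default prefix "plink2" (plink2.pgen/.pvar/.psam); the function name and the output prefix in the example call are hypothetical.

```shell
# Export an existing pgen/pvar/psam fileset to a bgzipped VCF
# that carries dosages in the DS field.
# $1 = input fileset prefix, $2 = output prefix (hypothetical name)
export_dosage_vcf() {
  plink2 --pfile "$1" \
         --export vcf vcf-dosage=DS bgz \
         --out "$2"
}

# Example call; with the bgz modifier the result should be
# ukb_chr17_dosage.vcf.gz:
# export_dosage_vcf plink2 ukb_chr17_dosage
```

Here --pfile is shorthand for passing --pgen, --pvar and --psam with a shared prefix.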

modified 14 months ago • written 14 months ago by chrchang523
3 months ago, bcole wrote:

Note that you can do this in one step if you want, e.g.

plink2 --bgen ukb_imp_chr21_v3.bgen --sample ukb_imp_chr21_v3_s.sample --export vcf vcf-dosage=DS

Quick note about converting UK Biobank BGEN to VCF: I first tried this using QCTOOL, and after 15 days only ~1.1 million lines from each chromosome had been written. At that pace it would take ~15 weeks to convert chromosome 1 to VCF.gz using QCTOOL.

I then saw this post, installed a plink2 module on my HPC, and was able to convert the chr21 UKBB BGEN to VCF (not bgz) in 98 minutes! In other words, plink2 seems to convert BGEN to VCF many, many times faster than QCTOOL.

As Chris points out, just the chr21 (the smallest autosome) VCF file from UKBB was 2.4TB, so you should definitely consider using the bgz modifier to reduce file size.
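
Putting the suggestions in this thread together, a one-step conversion with bgzipped output might look like the sketch below, followed by a couple of basic sanity checks on the result. The checks assume bcftools is installed, which nobody in this thread mentions, and the function names and output prefix are hypothetical. They only confirm that the file parses, how many samples it holds, and that a DS FORMAT field is declared in the header; they do not validate the dosages themselves.

```shell
# One-step BGEN -> bgzipped VCF with dosages in the DS field.
# $1 = BGEN file, $2 = .sample file, $3 = output prefix (hypothetical name)
bgen_to_vcf_bgz() {
  plink2 --bgen "$1" ref-first \
         --sample "$2" \
         --export vcf vcf-dosage=DS bgz \
         --out "$3"
}

# Basic sanity checks on the result; assumes bcftools is installed.
# $1 = VCF file (.vcf.gz)
check_vcf() {
  # Print the number of samples in the VCF. Compare it by hand with the
  # .sample file, which has two header lines before the per-sample rows.
  bcftools query -l "$1" | wc -l
  # Print how many header lines mention ID=DS; expected to be non-zero
  # if dosages were exported.
  bcftools view -h "$1" | grep -c 'ID=DS'
}

# Example calls:
# bgen_to_vcf_bgz ukb_imp_chr21_v3.bgen ukb_imp_chr21_v3_s.sample ukb_chr21
# check_vcf ukb_chr21.vcf.gz
```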

modified 3 months ago • written 3 months ago by bcole

How large is ukb_imp_chr21_v3.bgen?

written 5 weeks ago by Shicheng Guo