3 days ago

I am trying to extract a handful of variants from a bgen file. The command below works, but it is slow: most of the runtime is spent converting every variant in the file, and most of those are then dropped because they are absent from the SNPs file. Why does plink convert all variants in the first place? Initially I thought it was due to the file type change (bgen to vcf), but trying other output formats made no noticeable difference. Is there a way to keep plink from parsing and/or converting every variant in the file?

plink2 \
--bgen data/chrom1.bgen ref-last \
--sample data/chrom1.sample \
--chr 1 \
--extract snps.txt \
--export vcf \
--out foo

3 days ago

This is discussed at https://www.cog-genomics.org/plink/2.0/input#keep_autoconv. If you intend to run plink2 more than once on the same dataset, you should convert it to the appropriate plink binary format once, and then run your later commands against the converted files.
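
For the command in the question, the convert-once-then-extract workflow might look roughly like the sketch below. The output prefix `data/chrom1` for the converted fileset and the `run`/`DRY_RUN` helper are my own assumptions for illustration, not anything plink2 requires; as written the script only prints the two commands, and setting `DRY_RUN=0` would actually execute them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1 (one time only): import the bgen into plink2 binary format.
# Writes data/chrom1.pgen / .pvar / .psam (the prefix is an assumption).
convert_once() {
  run plink2 --bgen data/chrom1.bgen ref-last --sample data/chrom1.sample \
    --make-pgen --out data/chrom1
}

# Step 2 (repeat as often as needed): extract from the converted fileset,
# so the bgen no longer has to be imported on every run.
extract_snps() {
  run plink2 --pfile data/chrom1 --chr 1 --extract snps.txt \
    --export vcf --out foo
}

convert_once
extract_snps
```

The first step still pays the full import cost, so this should only help from the second run onward.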

So. Much. Better. Thank you.
