Plink runtime when using "--extract"
3 days ago

I am attempting to extract a few variants from a bgen file. The command below works, but it is slow: most of the runtime is spent converting every variant in the file, and most of those variants are then dropped because they are absent from the SNPs file. I am wondering why plink converts all variants in the first place. Initially, I thought it could be due to the file type change (bgen to vcf), but trying other output formats has not noticeably changed the runtime. Is there a way to stop plink from parsing and/or converting all variants in the file?

plink2 \
    --bgen data/chrom1.bgen ref-last \
    --sample data/chrom1.sample \
    --chr 1 \
    --extract snps.txt \
    --export vcf \
    --out foo 
3 days ago

This is discussed in the plink2 input documentation (https://www.cog-genomics.org/plink/2.0/input#keep_autoconv). If you intend to run plink2 more than once on the same dataset, you should convert it to the appropriate plink binary format first (for plink2, a .pgen/.pvar/.psam fileset) and run your later commands on that.
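A minimal sketch of that two-step workflow, reusing the file names from the question. The output prefix data/chrom1 is an assumption chosen for illustration, and the plink2 check only keeps the script from failing where plink2 is not installed. As I read the linked documentation, plink2 autoconverts non-native input such as bgen to its own binary format before filtering, so this pays the conversion cost once rather than on every run.

```shell
# Hypothetical output prefix for the converted fileset
# (creates data/chrom1.pgen, data/chrom1.pvar, data/chrom1.psam).
pfile=data/chrom1

if command -v plink2 >/dev/null 2>&1; then
    # Step 1 (run once): import the bgen and save it in plink2's binary format.
    if [ ! -f "${pfile}.pgen" ]; then
        plink2 \
            --bgen data/chrom1.bgen ref-last \
            --sample data/chrom1.sample \
            --make-pgen \
            --out "${pfile}"
    fi

    # Step 2 (repeat as needed): extract variants from the converted fileset.
    plink2 \
        --pfile "${pfile}" \
        --chr 1 \
        --extract snps.txt \
        --export vcf \
        --out foo
else
    echo "plink2 not found on PATH; nothing was run" >&2
fi
```

Later extractions with a different SNP list only need step 2, which reads the .pgen fileset directly instead of re-importing the bgen.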


So. Much. Better. Thank you.
