Extract positions from a huge VCF file
3.2 years ago
the_cowa ▴ 40

I have a list of chromosomes and positions like this:

 Chr1 254
 Chr5 8965
 ChrX 25
 ChrY 8965
 Chr19 2354

and I need to extract these positions from a huge VCF file (1 TB). So far I have used

bcftools view -T Locations.txt Input.vcf.gz >Output.vcf

But it is taking days to finish. Is there any method or program to speed this up, or is it possible to do it with tabix?

vcf bcftools tabix • 1.2k views

Try tabix: bgzip the VCF, index it, and then run tabix -R with the OP's position list as the regions file.
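A minimal sketch of that workflow, assuming htslib's `bgzip` and `tabix` are on the PATH. The file names `demo.vcf`, `demo_regions.txt` and `demo_subset.vcf` and the two records are made up for illustration; the real input would be the 1 TB VCF, and the regions file would be the OP's list saved as tab-delimited CHROM and POS columns.

```shell
# Build a tiny, position-sorted demo VCF (hypothetical data, tab-delimited).
printf '##fileformat=VCFv4.2\n' > demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> demo.vcf
printf 'Chr1\t254\t.\tA\tG\t.\t.\t.\n' >> demo.vcf
printf 'Chr1\t900\t.\tC\tT\t.\t.\t.\n' >> demo.vcf

# Regions list: tab-delimited CHROM and POS (1-based).
printf 'Chr1\t254\n' > demo_regions.txt

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    # Block-compress; a plain gzip file cannot be indexed by tabix.
    bgzip -c demo.vcf > demo.vcf.gz
    # Write demo.vcf.gz.tbi (-f overwrites an existing index).
    tabix -f -p vcf demo.vcf.gz
    # -h keeps the header; -R jumps to the listed regions via the index.
    tabix -h -R demo_regions.txt demo.vcf.gz > demo_subset.vcf
else
    echo "bgzip/tabix not found; skipping the indexed query" >&2
fi
```

Compressing and indexing still needs one full pass over the file, but after that each lookup reads only the blocks that cover the requested positions.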

3.2 years ago

Use option -R, not option -T. -R jumps through the index, whereas -T streams through the whole file:

-R, --regions-file <file>           restrict to regions listed in a file
-T, --targets-file [^]<file>        similar to -R but streams rather than index-jumps. Exclude regions with "^" prefix

Furthermore, your input file is not a BED file (chrom, start, end), so I'm not sure it will work as is. (Maybe?)
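On the file-format doubt: as far as I know, the bcftools manual describes the `-R` file as either BED (recognized by a `.bed` suffix, 0-based and half-open) or tab-delimited CHROM, POS and optionally END columns (1-based, inclusive), so a two-column list should be accepted as long as it is tab-separated. A sketch under that assumption; `Locations.tsv`, `Locations.bed` and `Output.vcf.gz` are hypothetical output names, and the sample list simply recreates the one from the question.

```shell
# Recreate the list from the question (space-separated, leading blank).
cat > Locations.txt <<'EOF'
 Chr1 254
 Chr5 8965
 ChrX 25
 ChrY 8965
 Chr19 2354
EOF

# Tab-delimited CHROM<TAB>POS, 1-based.
awk 'NF >= 2 { print $1 "\t" $2 }' Locations.txt > Locations.tsv

# Equivalent BED: 0-based start, half-open end, so POS becomes [POS-1, POS).
awk 'NF >= 2 { print $1 "\t" ($2 - 1) "\t" $2 }' Locations.txt > Locations.bed

# Index-jumping query; needs a bgzip-compressed, indexed VCF.
if command -v bcftools >/dev/null 2>&1 && [ -f Input.vcf.gz ]; then
    # Build a .tbi index only if neither index type is present yet.
    [ -f Input.vcf.gz.tbi ] || [ -f Input.vcf.gz.csi ] || bcftools index -t Input.vcf.gz
    bcftools view -R Locations.tsv -Oz -o Output.vcf.gz Input.vcf.gz
fi
```

The chromosome names in the list have to match the contig names in the VCF exactly (for example `Chr1` versus `chr1` or `1`), otherwise nothing will be returned for those lines.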
