Question: vcf not indexing
alex200 wrote:

I am trying to index a VCF file with tabix but get the following error:

tabix -p vcf dbsnp_138.hg19.vcf.gz
Not a BGZF file: dbsnp_138.hg19.vcf.gz
tbx_index_build failed: dbsnp_138.hg19.vcf.gz

 

Thoughts on how to proceed?  Thanks!

tabix • 11k views
modified 5.1 years ago by Sean Davis26k • written 5.1 years ago by alex200
Sean Davis26k wrote:

It looks to me like the dbSNP file was compressed with plain gzip rather than bgzip. Decompress it, recompress it with bgzip, then index:

gunzip dbsnp_138.hg19.vcf.gz
bgzip dbsnp_138.hg19.vcf
tabix -p vcf dbsnp_138.hg19.vcf.gz
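If you would rather leave the original download untouched, the same conversion can probably be streamed instead of done in place. This is only a sketch: the function name `rebgzip_and_index` and the output filename are made up for illustration, and it assumes `bgzip` and `tabix` from htslib are on your PATH.

```shell
# Hypothetical helper (not part of htslib): convert a plain-gzip VCF to BGZF
# and build a tabix index for it. Assumes bgzip and tabix are installed.
rebgzip_and_index() {
    in=$1    # existing plain-gzip file, e.g. dbsnp_138.hg19.vcf.gz
    out=$2   # new BGZF file to create, e.g. dbsnp_138.hg19.bgz.vcf.gz (made-up name)

    # Decompress to stdout and recompress blockwise; the input file is not modified.
    gunzip -c "$in" | bgzip -c > "$out" || return 1

    # Index the new file as VCF; tabix writes "$out.tbi" next to it.
    tabix -p vcf "$out"
}
```

Called as `rebgzip_and_index dbsnp_138.hg19.vcf.gz dbsnp_138.hg19.bgz.vcf.gz`, this needs enough disk space for both compressed copies but avoids writing the uncompressed VCF.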
written 5.1 years ago by Sean Davis26k

@Sean Davis you just saved me a lot of frustration. I found this after a few searches and it worked. Thanks!

written 3.0 years ago by jespinoz20

I had the same problem: when I compressed with `gzip <file>.vcf`, tabix gave the error `tbx_index_build failed: <file>.vcf.gz`, but it worked with `bgzip <file>.vcf`.

written 20 months ago by marongiu.luigi510

Of course it does. bgzip compresses the file blockwise (hence the "b" in bgzip), and tabix relies on that. It lets tabix retrieve data quickly by decompressing only a small part of a (sometimes huge) file, guided by the index. Plain gzip does not write independently decompressible blocks, so tabix cannot index its output.
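If you want to check which kind of file you have before running tabix, the block structure is visible in the first few bytes. As far as I understand the BGZF description in the SAM/BAM specification, every block starts with a gzip header that has the extra-field flag set (bytes `1f 8b 08 04`) and, as bgzip writes it, carries a `BC` subfield identifier at bytes 12-13. A rough sketch along those lines follows; `is_bgzf` is a made-up helper name, and it only inspects the first block header, so treat it as a quick sanity check rather than a validator.

```shell
# Hypothetical helper: exit status 0 if the file begins with a BGZF-style
# block header, 1 otherwise. Only the first 16 bytes are inspected.
is_bgzf() {
    # Dump the first 16 bytes as one continuous lowercase hex string.
    head_hex=$(od -An -tx1 -N16 "$1" | tr -d ' \n')

    case "$head_hex" in
        # 1f8b0804         : gzip magic, deflate method, FEXTRA flag set
        # 16 hex digits    : bytes 4-11 (mtime, xfl, os, xlen), not checked here
        # 4243             : "BC" subfield identifier at bytes 12-13
        1f8b0804????????????????4243*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_bgzf dbsnp_138.hg19.vcf.gz || echo "plain gzip: recompress with bgzip"` should print the message for a file produced by ordinary gzip, whose header normally lacks the extra field.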

modified 20 months ago • written 20 months ago by ATpoint34k