Question: vcf not indexing
3.5 years ago by
alex170
United States
alex170 wrote:

I am trying to index a VCF file with tabix but get the following error:

tabix -p vcf dbsnp_138.hg19.vcf.gz
Not a BGZF file: dbsnp_138.hg19.vcf.gz
tbx_index_build failed: dbsnp_138.hg19.vcf.gz

 

Thoughts on how to proceed?  Thanks!

tabix • 5.7k views
modified 3.5 years ago by Sean Davis25k • written 3.5 years ago by alex170
3.5 years ago by
Sean Davis25k
National Institutes of Health, Bethesda, MD
Sean Davis25k wrote:

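It looks to me like the dbSNP file is not bgzipped (probably it was compressed with plain gzip). Try decompressing it, recompressing with bgzip, and indexing again: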

gunzip dbsnp_138.hg19.vcf.gz
bgzip dbsnp_138.hg19.vcf
tabix -p vcf dbsnp_138.hg19.vcf.gz
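The same fix can be sketched as a small shell helper that first checks whether the file is already BGZF and only recompresses when it is not. This is a rough sketch, not part of htslib: the function names `is_bgzf` and `reindex_vcf` are made up for illustration, and the check assumes the BGZF "BC" extra subfield comes first in the gzip header, which is how bgzip writes it.

```shell
# Return 0 if the file starts with a BGZF block header, 1 otherwise.
# A BGZF block is a gzip member with FLG=4 (extra field present) and an
# extra subfield tagged "BC" of length 2 at byte offsets 12-15.
is_bgzf() {
    # First 16 bytes as one lowercase hex string.
    header=$(head -c 16 "$1" | od -An -v -tx1 | tr -d ' \t\n')
    case "$header" in
        # 1f8b = gzip magic, 08 = deflate, 04 = FEXTRA; 8 bytes skipped;
        # 4243 = "BC", 0200 = subfield length 2 (little-endian).
        1f8b0804????????????????42430200) return 0 ;;
        *) return 1 ;;
    esac
}

# Recompress a plain-gzip VCF with bgzip if needed, then index it.
# Assumes gunzip, bgzip and tabix are on PATH.
reindex_vcf() {
    vcf_gz=$1
    if ! is_bgzf "$vcf_gz"; then
        # Stream through a pipe so no uncompressed copy is left on disk.
        gunzip -c "$vcf_gz" | bgzip -c > "$vcf_gz.tmp" \
            && mv "$vcf_gz.tmp" "$vcf_gz" || return 1
    fi
    tabix -p vcf "$vcf_gz"
}
```

Calling `reindex_vcf dbsnp_138.hg19.vcf.gz` would then replace the file in place and build the index; keep a backup if the original download is hard to get again.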

@Sean Davis you just saved me a lot of frustration. I found this after a few searches and it worked. Thanks!

written 17 months ago by jespinoz20

I had the same problem: when I compressed with `gzip <file>.vcf`, tabix gave the error `tbx_index_build failed: <file>.vcf.gz`, but it worked with `bgzip <file>.vcf`.

written 20 days ago by marongiu.luigi330

Of course it does. bgzip does blockwise compression of the file (hence the b in bgzip), which tabix relies on. That lets tabix retrieve data quickly by decompressing only the small part of a (sometimes huge) file that the index points to. Plain gzip does not compress in independently addressable blocks.
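To make the blockwise layout a bit more concrete: every BGZF block is a small gzip member that records its own compressed size in the header, so a reader can hop from block to block without inflating anything. Here is a rough sketch that prints the size of the first block. The function name `bgzf_first_block_size` is made up for illustration, and it assumes the file really is BGZF with the "BC" subfield first, as bgzip writes it.

```shell
# Print the total size in bytes of the first BGZF block in a file.
# BSIZE sits at byte offsets 16-17 (little-endian) and stores
# "block size minus 1", so 1 is added back.
bgzf_first_block_size() {
    # The two BSIZE bytes as unsigned decimals, e.g. "27 0".
    bsize_bytes=$(head -c 18 "$1" | tail -c 2 | od -An -v -tu1)
    # Word splitting is intended here: low byte -> $1, high byte -> $2.
    set -- $bsize_bytes
    echo $(( $1 + $2 * 256 + 1 ))
}
```

Running `bgzf_first_block_size dbsnp_138.hg19.vcf.gz` on a bgzipped file gives the offset at which the second block starts. The tabix index stores offsets of this kind, which is why a region query only has to decompress the relevant blocks.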

modified 20 days ago • written 20 days ago by ATpoint8.0k