Question: vcf not indexing
8
5.5 years ago by
alex220
United States
alex220 wrote:

Trying to index a VCF file, but getting the following error:

tabix -p vcf dbsnp_138.hg19.vcf.gz
Not a BGZF file: dbsnp_138.hg19.vcf.gz
tbx_index_build failed: dbsnp_138.hg19.vcf.gz

 

Thoughts on how to proceed?  Thanks!

tabix • 12k views
modified 5.5 years ago by Sean Davis26k • written 5.5 years ago by alex220
30
5.5 years ago by
Sean Davis26k
National Institutes of Health, Bethesda, MD
Sean Davis26k wrote:

Looks to me like the dbsnp file is not bgzipped?  

gunzip dbsnp_138.hg19.vcf.gz
bgzip dbsnp_138.hg19.vcf
tabix -p vcf dbsnp_138.hg19.vcf.gz
written 5.5 years ago by Sean Davis26k
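
To confirm the diagnosis before recompressing, one option is to look at the first bytes of the file. The following is a minimal sketch, assuming only the published BGZF layout (a gzip member with the FEXTRA flag set and a two-byte `BC` extra subfield). The helper names `is_bgzf` and `file_is_bgzf` are made up for this illustration and are not part of tabix or htslib.

```python
import gzip
import struct


def is_bgzf(header: bytes) -> bool:
    """Return True if `header` starts like a BGZF block.

    A BGZF block is a gzip member (magic 1f 8b, deflate method 8) with the
    FEXTRA flag set and an extra subfield identified by 'B', 'C' of length 2.
    """
    if len(header) < 12:
        return False
    if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
        return False  # not gzip/deflate at all
    if not header[3] & 0x04:
        return False  # FEXTRA not set: plain gzip output
    (xlen,) = struct.unpack_from("<H", header, 10)
    extra = header[12:12 + xlen]
    pos = 0
    # Walk the extra subfields looking for the 'BC' one.
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack_from("<H", extra, pos + 2)
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False


def file_is_bgzf(path: str) -> bool:
    """Hypothetical convenience wrapper: check the start of a file on disk."""
    with open(path, "rb") as fh:
        return is_bgzf(fh.read(12 + 0xFFFF))


# The standard 28-byte empty BGZF block that marks end of file.
BGZF_EOF = bytes.fromhex(
    "1f8b08040000000000ff0600424302001b0003000000000000000000"
)

if __name__ == "__main__":
    print(is_bgzf(gzip.compress(b"plain gzip data")))  # False
    print(is_bgzf(BGZF_EOF))  # True
```

Calling `file_is_bgzf("dbsnp_138.hg19.vcf.gz")` on the file from the question would be expected to return False if it was written by plain gzip, which matches the "Not a BGZF file" message.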
2

@Sean Davis you just saved me a lot of frustration. I found this after a few searches and it worked. Thanks!

written 3.4 years ago by jespinoz20

I had the same problem: when I compressed with gzip <file>.vcf, tabix gave the error: tbx_index_build failed: <file>.vcf.gz, but it worked with `bgzip <file>.vcf`.

written 2.0 years ago by marongiu.luigi520
2

Of course it does. bgzip compresses the file blockwise (hence the b in bgzip), which tabix relies on. That lets tabix retrieve data quickly by decompressing only a small part of a (sometimes huge) file, guided by the index. Plain gzip does not write independent, seekable blocks like that.

modified 2.0 years ago • written 2.0 years ago by ATpoint39k
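
To illustrate why blockwise compression makes indexed lookup possible, here is a toy sketch in Python. It is not the real BGZF format or the tabix index: it just compresses fixed-size chunks independently with zlib and records where each compressed chunk starts, so a single chunk can be decompressed without touching the rest. The names `compress_blocks` and `read_block` and the block size are assumptions made for the example.

```python
import zlib

# Toy block size for this sketch; real BGZF blocks are limited to 64 KiB.
BLOCK_SIZE = 64 * 1024


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Compress each chunk of `data` on its own.

    Returns (blob, index), where `index` holds one (offset, length) pair per
    compressed block inside `blob`.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), block_size):
        compressed = zlib.compress(data[start:start + block_size])
        index.append((len(blob), len(compressed)))
        blob += compressed
    return bytes(blob), index


def read_block(blob: bytes, index, block_number: int) -> bytes:
    """Decompress only the requested block, using the index to find it."""
    offset, length = index[block_number]
    return zlib.decompress(blob[offset:offset + length])


if __name__ == "__main__":
    data = bytes(range(256)) * 1000  # 256,000 bytes of sample data
    blob, index = compress_blocks(data)
    third = read_block(blob, index, 2)  # touches only one compressed block
    print(third == data[2 * BLOCK_SIZE:3 * BLOCK_SIZE])
```

With a single gzip stream there is no such table of independent starting points, so reaching data in the middle means decompressing everything before it.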