bcftools compressing and indexing vcf files
3.3 years ago

Hello,

I am trying to merge multiple VCF files using bcftools but it threw an error saying that the file is not compressed.

I want to know if the right command to compress the file would be:

bcftools view -I input.vcf -O z -o output.vcf

Should I index the files too before merging? If so, which command should I use?

Thanks in advance.

sequence bcftools vcf index • 17k views
3.2 years ago

Hello Inquisitive8995,

Short answers: yes and yes :)

Long answer:

Compressing a vcf file can be done in two ways:

  1. using bcftools view as you show (though I would name the output output.vcf.gz, so it is clear the file is compressed)
  2. using bgzip -c input.vcf > output.vcf.gz
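The two options above could be wrapped up roughly like this. This is only a sketch: the function names and file names are placeholders I made up, and it assumes bcftools and bgzip (from htslib) are on your PATH.

```shell
#!/bin/sh
# Sketch: two ways to produce a bgzip-compressed VCF.
# Function names and file names are hypothetical placeholders.

compress_with_bcftools() {
    # $1 = plain VCF, $2 = compressed output (e.g. output.vcf.gz)
    # -O z asks for bgzip-compressed VCF output, -o names the output file
    bcftools view -O z -o "$2" "$1"
}

compress_with_bgzip() {
    # $1 = plain VCF, $2 = compressed output
    # -c writes to stdout, so the original file is left untouched
    bgzip -c "$1" > "$2"
}

# Usage (uncomment one):
# compress_with_bcftools input.vcf output.vcf.gz
# compress_with_bgzip input.vcf output.vcf.gz
```

Either way the result is bgzip-compressed, not plain gzip, which is what bcftools and tabix need for indexing.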

Whether you should index your compressed VCF file depends on which tool you want to run on it afterwards. Some require an index (bcftools merge does), some do not. Tools that do not need an index also cannot take advantage of random access to positions in the VCF file.

There are two index formats: tbi and csi. Unfortunately I cannot tell you what the difference between them is (maybe others can say something about it). tbi is the default index format if you use tabix input.vcf.gz, and csi is the default format if you use bcftools index input.vcf.gz. Both tabix and bcftools provide an option to use the other index format. At least bcftools can work with both index formats.
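For reference, here is a small sketch of how either index type can be built with either tool. The function names and input.vcf.gz are placeholders of mine, and the input is assumed to be bgzip-compressed already. As far as I know, the main practical difference is that csi can handle chromosomes longer than 2^29 bases (about 512 Mbp), which tbi cannot.

```shell
#!/bin/sh
# Sketch: building a tbi or csi index for a bgzip-compressed VCF.
# Function names and file names are hypothetical placeholders.

index_tbi() {
    # tabix writes <file>.tbi by default; -p vcf selects the VCF preset
    tabix -p vcf "$1"
    # same index type via bcftools: bcftools index -t "$1"
}

index_csi() {
    # bcftools index writes <file>.csi by default
    bcftools index "$1"
    # same index type via tabix: tabix -C -p vcf "$1"
}

# Usage (uncomment one):
# index_tbi input.vcf.gz
# index_csi input.vcf.gz
```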

Depending on the size of your VCF files, it might be useful to convert them to compressed BCF instead of compressed VCF. bcftools is designed to work with BCF, so at every step it converts VCF to BCF for processing and back to VCF afterwards. This produces a massive overhead. If you start from BCF, it is much faster (2-4 times in my last test).
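If you want to try the BCF route for your merge, it could look something like the sketch below. The function names and file names are placeholders of mine; note that BCF output needs the contigs to be declared in the VCF header.

```shell
#!/bin/sh
# Sketch: convert VCFs to indexed compressed BCF, then merge them.
# Function names and file names are hypothetical placeholders.

to_bcf() {
    # $1 = input VCF (plain or bgzip-compressed), $2 = output BCF
    # -O b asks for compressed BCF output
    bcftools view -O b -o "$2" "$1"
    # bcftools merge needs every input to be indexed
    bcftools index "$2"
}

merge_bcfs() {
    # $1 = merged output BCF, remaining arguments = indexed input files
    out=$1
    shift
    bcftools merge -O b -o "$out" "$@"
}

# Usage (uncomment to run):
# to_bcf sample1.vcf sample1.bcf
# to_bcf sample2.vcf sample2.bcf
# merge_bcfs merged.bcf sample1.bcf sample2.bcf
```

If a later tool needs VCF again, bcftools view -O z -o merged.vcf.gz merged.bcf converts the result back once at the end.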

fin swimmer

3.3 years ago

compress with bgzip and index with bcftools index your.vcf.gz
