Question: Generate vcf.gz file and its index file vcf.gz.tbi
20 votes · asked 7.3 years ago by lyz10302012 (330), China:

Can anyone tell me how to generate a vcf.gz file and its index file, vcf.gz.tbi, like those distributed by the 1000 Genomes Project?

ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20110521/

Tags: tabix, vcf • 74k views
Modified 18 months ago by zx8754 (9.1k)
56 votes · answered 7.3 years ago by Matt Shirley (9.3k), Cambridge, MA:

bgzip -c file.vcf > file.vcf.gz
tabix -p vcf file.vcf.gz

See the tabix documentation.

Modified 17 months ago by RamRS (26k)

Does the VCF have to be sorted, as SAM/BAM files do?

Reply written 4.1 years ago by Dan (520)

Yes, tabix requires sorted input files. I don't think the sorting order matters, but records must be grouped together by rows.

Reply written 4.1 years ago by Matt Shirley (9.3k)

Do you mean that chromosomes may be ordered numerically or alphanumerically, ascending or descending, and positions numerically, ascending or descending? I can't see how else the sort order could not matter.

Reply written 4.1 years ago by Dan (520), modified 4.1 years ago

Yes. To clarify, I think you just need your chromosomes grouped together, and then records for each chromosome need to be sorted in ascending coordinate order.

Reply written 4.1 years ago by Matt Shirley (9.3k)
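
The ordering described in this exchange (chromosomes grouped together, positions ascending within each chromosome) can be sketched with standard Unix tools. This is only a minimal illustration, not something posted in the thread: `sort_vcf` is a hypothetical helper name, and it assumes a tab-delimited VCF whose header lines all start with `#`. `bcftools sort` is an alternative if bcftools is installed.

```shell
#!/bin/sh
# sort_vcf (hypothetical helper): print a VCF with its header first, followed
# by the records sorted by chromosome name and then by ascending position.
sort_vcf() {
    tab=$(printf '\t')
    # Header lines (starting with '#') stay at the top in their original order.
    grep '^#' "$1"
    # Column 1 is CHROM (byte-wise order, which is enough to group chromosomes);
    # column 2 is POS, compared numerically.
    grep -v '^#' "$1" | LC_ALL=C sort -t "$tab" -k1,1 -k2,2n
}

# Example use (requires bgzip and tabix):
#   sort_vcf file.vcf | bgzip -c > file.sorted.vcf.gz
#   tabix -p vcf file.sorted.vcf.gz
```

The chromosome order produced here is lexicographic (1, 10, 11, ..., 2), which is fine for tabix as long as each chromosome's records are contiguous.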
20 votes · answered 7.3 years ago by Erik Garrison (2.2k), Napoli, IT / UCSC:

I have a script which does this using a VCF stream on stdin:

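#!/bin/bash
# Usage: bgziptabix output.vcf.gz   (reads uncompressed VCF from stdin)

file=$1

# Compress stdin with bgzip, then index; -f overwrites an existing index.
bgzip -c > "$file"
tabix -f -p vcf "$file"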

I found I was always writing the same lines over and over when indexing VCF files. You can use it like this:

cat uncompressed.vcf | bgziptabix compressed.vcf.gz
Modified 17 months ago by RamRS (26k)
7 votes · answered 2.1 years ago by miaowzai (210), United States:

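bgzip file.vcf       # or:   bcftools view file.vcf -Oz -o file.vcf.gz
tabix file.vcf.gz    # or:   bcftools index -t file.vcf.gz

The bcftools commands are convenient where tabix and bgzip are not installed. Note that bcftools index writes a .csi index by default; the -t option requests the .tbi index asked about here. Source: https://github.com/samtools/bcftools/issues/668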
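
The choice between the two toolsets can be automated. The following is a rough sketch rather than anything from the linked issue: `compress_and_index` is a hypothetical function name, and it assumes that whichever tools it finds on `PATH` behave like the standard htslib/bcftools builds.

```shell
#!/bin/sh
# compress_and_index (hypothetical helper): create FILE.gz and its .tbi index,
# preferring bgzip + tabix and falling back to bcftools when they are missing.
compress_and_index() {
    vcf="$1"
    if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
        # -c writes to stdout, so the original .vcf is left in place.
        bgzip -c "$vcf" > "$vcf.gz" && tabix -p vcf "$vcf.gz"
    elif command -v bcftools >/dev/null 2>&1; then
        # -Oz writes bgzip-compressed VCF; -t requests a .tbi instead of a .csi index.
        bcftools view "$vcf" -Oz -o "$vcf.gz" && bcftools index -t "$vcf.gz"
    else
        echo "need bgzip and tabix, or bcftools, on PATH" >&2
        return 1
    fi
}

# Example use:
#   compress_and_index file.vcf    # creates file.vcf.gz and file.vcf.gz.tbi
```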

Modified 2.1 years ago
3 votes · answered 2.5 years ago by DataFanatic (150):

bgzip genotypes.vcf && tabix -p vcf genotypes.vcf.gz

For details, see:

https://qtltools.github.io/qtltools/pages/input_files.html

Modified 18 months ago by zx8754 (9.1k)