Question: Generate vcf.gz file and its index file vcf.gz.tbi
lyz10302012 wrote, 5.5 years ago:

Can anyone tell me how to generate a vcf.gz file and its index file, vcf.gz.tbi, like the ones in the 1000 Genomes Project? ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20110521/

Matt Shirley wrote, 5.5 years ago:

bgzip -c file.vcf > file.vcf.gz
tabix -p vcf file.vcf.gz

See the tabix documentation for details.
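Once the index exists, it is worth checking that tabix can actually read it. The following is only a sketch: the function name `index_and_check` and the file names are made up for illustration, and it assumes `bgzip` and `tabix` are on your PATH. It keeps the original file (`-c` writes to stdout), builds the index, and lists the sequence names tabix found.

```shell
# Hypothetical helper: compress and index a VCF, then list indexed chromosomes.
index_and_check() {
    vcf=$1
    bgzip -c "$vcf" > "$vcf.gz" || return 1   # -c writes to stdout, so the input is kept
    tabix -p vcf "$vcf.gz" || return 1        # writes "$vcf.gz.tbi" next to the data
    tabix -l "$vcf.gz"                        # print the sequence names in the index
}

# Example calls (file name and region are placeholders):
#   index_and_check file.vcf
#   tabix file.vcf.gz 1:10000-20000
```

If `tabix -l` prints the chromosomes you expect, region queries like the last line should work too.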


Dan replied, 2.3 years ago:

Does the VCF have to be sorted, the way a SAM/BAM file does?

Matt Shirley replied, 2.3 years ago:

Yes, tabix requires sorted input files. I don't think the order of the chromosomes matters, but records for the same chromosome must be grouped together in consecutive rows.

Dan replied, 2.3 years ago:

Do you mean the chromosomes can be ordered numerically or alphanumerically, ascending or descending, and the positions within each chromosome numerically, ascending or descending? I can't see how else the sort order could not matter.

Matt Shirley replied, 2.3 years ago:

Yes. To clarify, I think you just need your chromosomes grouped together, and then records for each chromosome need to be sorted in ascending coordinate order.
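For an unsorted VCF, one way to get that layout is to keep the header lines on top and sort only the records by chromosome and then by position. This is a minimal sketch, not the only way to do it: the function name `sort_vcf` and the file names are placeholders, and it assumes the header lines all start with `#` and the file is tab-delimited.

```shell
# Sort a VCF so tabix can index it: header lines first, then records
# grouped by chromosome (column 1) and ascending by position (column 2).
sort_vcf() {
    grep '^#' "$1"
    grep -v '^#' "$1" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}

# Typical use (file names are placeholders):
#   sort_vcf file.vcf | bgzip -c > file.sorted.vcf.gz
#   tabix -p vcf file.sorted.vcf.gz
```

The chromosome order that comes out is plain byte order (so 10 sorts before 2), which should be fine for tabix since it only needs each chromosome's records to be contiguous and position-sorted.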
Erik Garrison wrote, 5.5 years ago:

I have a script which does this using a VCF stream on stdin:

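#!/bin/bash
# Reads an uncompressed VCF on stdin; $1 is the output .vcf.gz path.

file=$1

# Compress stdin with bgzip, then build the index (-f overwrites an old one).
bgzip -c > "$file"
tabix -f -p vcf "$file"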

I found I was writing the same lines over and over when indexing VCF files. With the script saved as an executable named bgziptabix, you can use it like this:

cat uncompressed.vcf | bgziptabix compressed.vcf.gz

DataFanatic wrote, 9 months ago:

bgzip genotypes.vcf && tabix -p vcf genotypes.vcf.gz

See the details here if you need them: https://qtltools.github.io/qtltools/pages/input_files.html

miaowzai wrote, 3 months ago:

bgzip file.vcf       # or:   bcftools view file.vcf -Oz -o file.vcf.gz
tabix file.vcf.gz    # or:   bcftools index -t file.vcf.gz   (-t writes .tbi; without it bcftools writes .csi)

The bcftools commands are convenient where tabix and bgzip are not installed. I saw this at: https://github.com/samtools/bcftools/issues/668
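If only bcftools is available, sorting, compressing and indexing can all be done with it. This is a rough sketch under a few assumptions: the function name `bcftools_compress_index` and the file names are made up, and your bcftools is recent enough to have the `sort` subcommand. Note that `bcftools index` writes a .csi index unless you pass `-t`, so `-t` is what gives the .vcf.gz.tbi asked about here.

```shell
# Hypothetical helper: sort, bgzip-compress and tabix-index a VCF with bcftools only.
bcftools_compress_index() {
    in=$1
    out=$2
    bcftools sort -Oz -o "$out" "$in" || return 1   # -Oz: write bgzip-compressed VCF
    bcftools index -t "$out"                        # -t: write a .tbi index instead of .csi
}

# Example call (file names are placeholders):
#   bcftools_compress_index file.vcf file.sorted.vcf.gz
```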
