Question: Using Tabix And Vcf Tools To Get Cnv / Sv Frequencies From 1000 Genomes Data
Ryan D wrote (7.9 years ago):

I read this excellent post by Stephen on getting data from 1000 Genomes with tabix, but it does not seem to be working for me. I use tabix to get the data in the following manner:

tabix -fh ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr22.phase1_integrated_calls.20101123.snps_indels_svs.genotypes.vcf.gz 22:1000000-10000000 > ~/delete.vcf

This produces a VCF file, but the file seems to contain only header lines and no variant records, like so:

##INFO=<ID=SNPSOURCE,Number=.,Type=String,Description="indicates if a snp was called when analysing the low coverage or exome alignment data">
##reference=GRCh37
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  HG00096...

So when I run vcftools, I obviously get nothing, since there is no genotype data:

vcftools --vcf ~/delete.vcf --freq --out ~/delete.txt

VCFtools - v0.1.7
(C) Adam Auton 2009

Parameters as interpreted:
        --vcf /home/delahar/delete.vcf
        --freq
        --out /home/delahar/delete.txt

Reading Index file.
File contains 0 entries and 1092 individuals.
Applying Required Filters.
After filtering, kept 1092 out of 1092 Individuals
After filtering, kept 0 out of a possible 0 Sites
Error:No data left for analysis!

I'm guessing this is an issue with the way I'm using tabix. Ultimately I want to get the records that have VT=SV in their INFO column, so extra help on getting those would be greatly appreciated.

Thanks,

Rx

Tags: genome, tabix, vcftools, cnv

Comment from Zev.Kronenberg (7.9 years ago): I have heard from two people that they can't get tabix to retrieve data from the internet...

Answer from Adam (7.9 years ago):

Your tabix command is returning no data as there are no SNPs in that region of chr22. The first SNPs on chr22 are around the 16Mb mark. Try:

tabix -fh ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr22.phase1_integrated_calls.20101123.snps_indels_svs.genotypes.vcf.gz 22:1000000-16052250

And see if that returns some data.
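
Once a region does return records, the VT=SV rows the question asks about can be pulled out before running vcftools. Here is a minimal sketch, assuming the VT tag sits in the INFO column (field 8) of a tab-delimited VCF, as the question describes. The function names count_records and keep_sv are made up for illustration and are not part of tabix or vcftools:

```shell
# Count data (non-header) lines in a VCF file.
# A result of 0 means the tabix region query matched no records.
count_records() {
    awk '!/^#/ { n++ } END { print n + 0 }' "$1"
}

# Print all header lines plus only those records whose INFO column
# (field 8) contains the exact tag VT=SV.
# Assumption: INFO entries are separated by ';' as in standard VCF.
keep_sv() {
    awk -F'\t' '
        /^#/ { print; next }                       # pass headers through
        index(";" $8 ";", ";VT=SV;") > 0 { print } # exact tag match
    ' "$1"
}
```

For example, if `count_records ~/delete.vcf` prints 0, the region matched nothing. Otherwise `keep_sv ~/delete.vcf > ~/delete.sv.vcf` followed by `vcftools --vcf ~/delete.sv.vcf --freq --out ~/delete.sv` should give frequencies for the structural variants only.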


Reply from Ryan D (7.9 years ago): Thanks. I finally figured it out. You were exactly right.