VCFtools not filtering VCF file by positions
8.5 years ago
devenvyas ▴ 740

So I downloaded a gigantic gzipped VCF from the Mota man genome (ftp://biodisk.org/Store/Genome/African/Mota_man/Bam_and_VCF/GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz).

I want to filter it down to just the sites where I have data from the Human Origins SNP array. I made a file called positions.txt, which has chromosome TAB base-pair position for all the SNPs I have data on. A lot of the SNPs don't have rs numbers, so filtering by SNP ID won't work. Luckily everything is hg19. Here are the first few lines of positions.txt:

1    842013
1    891021
1    903426
1    949654
1    1018704

I run

vcftools --gzvcf ../GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz --positions positions.txt --recode --out Mota_HuOrg

and then I get an error message as follows

VCFtools - v0.1.12b
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
    --gzvcf ../GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz
    --out Mota_HuOrg
    --positions positions.txt
    --recode

Using zlib version: 1.2.3
Versions of zlib >= 1.2.4 will be *much* faster when reading zipped VCF files.
After filtering, kept 1 out of 1 Individuals
Outputting VCF file...
After filtering, kept 0 out of a possible -1563604250 Sites
File does not contain any sites
Run Time = 11431.00 seconds

Does anyone know what I am doing wrong? Thanks!

vcf vcftools SNP • 9.7k views

Still, thank you very much for the post. I already know how to filter by position.

8.5 years ago
devenvyas ▴ 740

I figured it out. My position list used plain chromosome numbers, but the VCF file prefixes each chromosome number with chr. I added chr to the positions file and it started working, but it seems to be jettisoning other information from the VCF file...
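
For anyone hitting the same thing, here is a rough sketch of the fix. The demo positions.txt contents are copied from the question; the output name positions_chr.txt is just a name picked for illustration. If the information going missing is the INFO column, one likely cause is that --recode on its own does not carry INFO fields over, and adding --recode-INFO-all keeps them. That is an assumption about what is being dropped, not something confirmed in this thread.

```shell
# Demo input: the first lines of positions.txt from the question (tab-separated).
printf '1\t842013\n1\t891021\n1\t903426\n' > positions.txt

# Prefix each chromosome with "chr" unless it already has the prefix,
# so the names match the ones used in the VCF.
awk 'BEGIN { FS = OFS = "\t" } $1 !~ /^chr/ { $1 = "chr" $1 } { print }' \
    positions.txt > positions_chr.txt

vcf=../GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz

# Only run the slow steps if vcftools and the VCF are actually present.
if command -v vcftools >/dev/null 2>&1 && [ -f "$vcf" ]; then
    # Check how the VCF names its chromosomes (first few data lines only).
    gzip -dc "$vcf" | grep -v '^#' | cut -f1,2 | head -n 5

    # Filter by position; --recode-INFO-all keeps the INFO fields in the output.
    vcftools --gzvcf "$vcf" --positions positions_chr.txt \
        --recode --recode-INFO-all --out Mota_HuOrg
fi
```

Checking the chromosome names first would have saved the three-hour run: "kept 0 out of a possible ... Sites" is what you get when none of the chromosome/position pairs in the list match the VCF.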
