So I downloaded a gigantic gzipped VCF from the Mota man genome (ftp://biodisk.org/Store/Genome/African/Mota_man/Bam_and_VCF/GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz).
I want to filter it down to just the sites where I have data from the Human Origins SNP array. I made a file called positions.txt, which has chromosome TAB basepair for all the SNPs I have data on. A lot of the SNPs don't have rs numbers, so filtering by SNP ID won't work. Luckily everything is hg19. Here are the first few lines of positions.txt:
1	842013
1	891021
1	903426
1	949654
1	1018704
Then I ran
vcftools --gzvcf ../GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz --positions positions.txt --recode --out Mota_HuOrg
and got the following error message:
VCFtools - v0.1.12b
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
	--gzvcf ../GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz
	--out Mota_HuOrg
	--positions positions.txt
	--recode

Using zlib version: 1.2.3
Versions of zlib >= 1.2.4 will be *much* faster when reading zipped VCF files.
After filtering, kept 1 out of 1 Individuals
Outputting VCF file...
After filtering, kept 0 out of a possible -1563604250 Sites
File does not contain any sites
Run Time = 11431.00 seconds
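To be clear about what I'm after, here is a minimal Python sketch of the filter I expect --positions to perform. The function names (load_positions, filter_vcf) are just illustrative and not part of vcftools. It assumes the chromosome names in positions.txt are written exactly as they appear in the VCF's CHROM column (for example 1 rather than chr1), since the comparison is a plain string match.

```python
import gzip


def load_positions(path):
    """Read 'chromosome<TAB>basepair' lines into a set of (chrom, pos) string pairs."""
    keep = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                keep.add((fields[0], fields[1]))
    return keep


def filter_vcf(vcf_gz_path, positions_path, out_path):
    """Copy header lines plus every record whose (CHROM, POS) is listed.

    Returns the number of records kept. Chromosome names are compared as
    exact strings (assumption: both files use the same naming scheme).
    """
    keep = load_positions(positions_path)
    kept = 0
    with gzip.open(vcf_gz_path, "rt") as vcf, open(out_path, "w") as out:
        for line in vcf:
            if not line.strip():
                continue  # skip blank lines
            if line.startswith("#"):
                out.write(line)  # keep meta-information and column header lines
                continue
            chrom, pos = line.split("\t", 2)[:2]
            if (chrom, pos) in keep:
                out.write(line)
                kept += 1
    return kept
```

With my files this would be called as filter_vcf("../GB20_sort_merge_dedup_l30_IR_q30_mapDamage_Entire.vcf.gz", "positions.txt", "Mota_HuOrg.recode.vcf"). It streams the VCF line by line, so memory use depends only on the size of positions.txt.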
Does anyone know what I am doing wrong? Thanks!