Selecting set of SNPs using vcftools
19 months ago

Hi,

I have a strange problem with vcftools while trying to subset a VCF file to a list of SNPs given by chromosome number and position.

I am trying to subset the VCF file to just 2 SNPs on chromosome 11, at positions 72496148 and 95855385. The command I am using is:

"vcftools --vcf chr11.vcf --positions chr11_snp_pos.txt --recode --recode-INFO-all --out SNPs_only"

where chr11_snp_pos.txt contains the following lines:

11 95855385 11 72496148

The strange thing is that position 95855385 is not returned and position 72496148 is returned twice. In other words, I get a VCF file with two records, but both are for position 72496148.

I tried to subset the VCF file with only position 95855385 and nothing was returned. I also tried subsetting 95855385 together with other positions, for example 13357183, and only position 13357183 was returned.

So it is only when I combine position 72496148 with 95855385 that I get a subset with 72496148 repeated.

I cannot understand what is going on! I hope my question is clear. I would appreciate any insights!

best wishes, Krishna

19 months ago
bari.ballew ▴ 260

Try altering chr11_snp_pos.txt to have a single tab-delimited position per line, as described in the vcftools documentation. E.g.:

11<tab>95855385
11<tab>72496148
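One way to be sure the separators are real tabs is to generate the file with printf rather than typing it in an editor. This is only a sketch: it reuses the file names from the question, and the awk check is an extra sanity step, not something vcftools requires.

```shell
# Write one CHROM<tab>POS pair per line; printf turns \t into a real tab.
printf '11\t95855385\n11\t72496148\n' > chr11_snp_pos.txt

# Sanity check: every line should have exactly two tab-separated fields.
awk -F'\t' 'NF != 2 { bad = 1 } END { exit bad + 0 }' chr11_snp_pos.txt \
  && echo "positions file looks OK"

# Same command as in the question; runs only if vcftools and the VCF are present.
if command -v vcftools >/dev/null 2>&1 && [ -f chr11.vcf ]; then
  vcftools --vcf chr11.vcf --positions chr11_snp_pos.txt \
    --recode --recode-INFO-all --out SNPs_only
fi
```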


Are there two different alt alleles at position 11:72496148? That could result in more than one line for a single position in the VCF.
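To check this directly, you could list the records the input VCF holds at each position. The helper below is a hypothetical sketch (records_at is not part of vcftools) and assumes an uncompressed, tab-delimited VCF. Two lines for 72496148 would point to multiple records at that site, and no output for 95855385 would mean the position is simply absent from chr11.vcf.

```shell
# Print every data record in a VCF at a given chromosome and position.
# Usage: records_at <vcf> <chrom> <pos>   (hypothetical helper, plain awk)
records_at() {
  awk -F'\t' -v c="$2" -v p="$3" '!/^#/ && $1 == c && $2 == p' "$1"
}

# Inspect the two positions from the question, if the VCF is present.
if [ -f chr11.vcf ]; then
  records_at chr11.vcf 11 72496148
  records_at chr11.vcf 11 95855385
fi
```

If the VCF names the chromosome differently (for example chr11 instead of 11), the CHROM column in the positions file has to match it exactly.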