Question: Subsetting a VCF with bcftools
3.4 years ago by
United States
miaowzai260 wrote:

I'm trying to keep only single nucleotide variants (which some call SNPs) from the 1000 Genomes Project data, for example the latest phase 3 release from 2013.

The bcftools manual says, under "view":

-v, --types snps|indels|mnps|other comma-separated list of variant types to select. Site is selected if any of the ALT alleles is of the type requested. Types are determined by comparing the REF and ALT alleles in the VCF record not INFO tags like INFO/INDEL or INFO/VT. Use --include to select based on INFO tags.

I haven't checked the subset file yet, but according to this description, "-v" compares the REF and ALT alleles to decide whether a variant is a SNP. I was worried that multi-allelic single nucleotide sites would also be excluded, because the ALT column then holds a string longer than one character (e.g. "A,T" in the ALT column).

I tried bcftools view --include 'VT=SNP' to select on the INFO column, but it failed with this error message:
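To illustrate the rule quoted from the manual, here is a rough sketch in shell and awk. It is a simplified approximation, not bcftools' actual code: it assumes ALT is split on commas and each allele is compared to REF on its own, and it only recognises single-base substitutions. The function name classify_site is hypothetical.

```shell
# classify_site REF ALT
# Prints "snp-site" if at least one comma-separated ALT allele is a
# single-base substitution of a single-base REF, otherwise "other".
# Simplified illustration of "site is selected if any of the ALT
# alleles is of the type requested".
classify_site() {
    awk -v ref="$1" -v alt="$2" 'BEGIN {
        n = split(alt, alleles, ",")      # one entry per ALT allele
        hit = 0
        for (i = 1; i <= n; i++) {
            a = alleles[i]
            if (length(ref) == 1 && a ~ /^[ACGTN]$/ && a != ref)
                hit = 1
        }
        print (hit ? "snp-site" : "other")
    }'
}

classify_site G A,T    # multi-allelic SNV    -> snp-site
classify_site G GA     # insertion only       -> other
classify_site G A,GA   # SNV plus insertion   -> snp-site
```

Under this reading, the length of the whole ALT string does not matter; what counts is the type of each individual allele.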

the tag "INFO/SNP" is not defined in the VCF header

My questions are: (1) How can I obtain only the variants that have "VT=SNP" in the INFO column? (2) Does -v snps retain variants with low allele frequency? I ask because "SNP" is sometimes taken to mean a common single nucleotide variant (allele frequency > 0.1% or 0.2%).


bcftools vcf • 1.6k views
3.4 years ago by
guillaume.rbt830 wrote:


(1) To keep only SNPs from a VCF, I use vcftools. Note that --out takes a prefix, so this command writes output.recode.vcf:

vcftools --vcf your_vcf.vcf --recode --remove-indels --out output

(2) I guess -v snps retains all SNPs regardless of allele frequency (in a VCF file, even singletons are called "SNP").
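For the bcftools route asked about in question (1), a minimal sketch follows. It assumes the error arises because bcftools filter expressions read an unquoted SNP as a tag name, so string values need double quotes inside the expression. The wrapper function names and file arguments are hypothetical, and the 0.01 cutoff is only an example value.

```shell
# Hypothetical wrappers: $1 is the input VCF/BCF, $2 the output .vcf.gz.

# Keep records whose INFO/VT tag is the string SNP. The double quotes
# mark SNP as a string value rather than a tag name.
keep_vt_snp() {
    bcftools view --include 'INFO/VT="SNP"' -Oz -o "$2" "$1"
}

# Keep sites where at least one ALT allele is a SNP, judged from REF/ALT.
# No allele-frequency filter is applied here.
keep_type_snps() {
    bcftools view --types snps -Oz -o "$2" "$1"
}

# Optionally also require a minor allele frequency of at least 1%.
keep_common_snps() {
    bcftools view --types snps --min-af 0.01:minor -Oz -o "$2" "$1"
}
```

A call would look like keep_vt_snp input.vcf.gz snps.vcf.gz, with both file names as placeholders. For multi-allelic records where VT lists more than one type, it is worth checking on your own data how the string comparison behaves before relying on it.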


Thanks for the answer! I have been using vcftools and it works great, but I think bcftools runs faster than vcftools. I'm still confused about how to set up the bcftools command, so I will wait for other answers. Thanks!

written 3.4 years ago by miaowzai260