Question: VCF - compare
manojkumarbioinfo (60) wrote 4.3 years ago:


I have 5 VCF files; the reads were aligned with BWA and the variants were called with GATK. I want to find the SNPs that are common to all 5 VCF files. I can get the number of common SNPs using vcf-compare, but I want to extract only the SNPs or variants that are present in all 5 files.

Can anyone help me find the common variants in my VCF files?

sequencing next-gen • 2.6k views
DG (7.1k) wrote 4.3 years ago:

BCFtools isec will do multiple-file intersection. If I remember correctly, the default output is a tab-delimited list of sites and not a VCF, but it will tell you how many variants overlap and what their positions are, etc.; with the -p option it can also write the matching records out as VCF files. You can specify how many of the listed files a variant has to appear in to be reported (the -n option), so it is easy to run it more than once and get the variants that appear in all 5, any 4, etc.
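
With bcftools itself, something along the lines of `bcftools isec -n=5 -p out_dir a.vcf.gz b.vcf.gz c.vcf.gz d.vcf.gz e.vcf.gz` should write the records shared by all five inputs into `out_dir` (the file and directory names are placeholders, and the inputs need to be bgzip-compressed and indexed). To make the "all 5, any 4" idea concrete, here is a minimal pure-Python sketch of the same kind of intersection. It is not how bcftools works internally: it assumes plain tab-delimited VCF text, matches variants on exact CHROM/POS/REF/ALT, and does no normalization, so differently represented indels will not match. The function names are made up for illustration.

```python
import gzip
from collections import Counter


def variant_keys(lines):
    """Return the set of (chrom, pos, ref, alt) keys found in VCF text lines."""
    keys = set()
    for line in lines:
        # Skip blank lines and header lines ("##meta" and "#CHROM ...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic sites list several ALT alleles separated by commas.
        for alt in alts.split(","):
            if alt != ".":
                keys.add((chrom, pos, ref, alt))
    return keys


def shared_variants(key_sets, min_files=None):
    """Return keys present in at least min_files of the sets (default: all)."""
    if min_files is None:
        min_files = len(key_sets)
    # Each set contributes a key at most once, so the count is a file count.
    counts = Counter(key for keys in key_sets for key in keys)
    return {key for key, n in counts.items() if n >= min_files}


def read_vcf(path):
    """Read variant keys from a plain or gzip/bgzip-compressed VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return variant_keys(handle)
```

Given a list `paths` of five VCF file names, `shared_variants([read_vcf(p) for p in paths])` would return the variants seen in all five, and `min_files=4` would relax that to at least four.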

Another option would be bcbio-variation from Brad Chapman's group. It has various subtools that you can use. You can generate summaries of concordance between files, as well as construct ensemble call sets where you specify the number of callers (VCF files) a variant has to appear in. The output of bcbio-variation ensemble calling is a VCF file, so it can be fed directly into downstream tools.
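
As a rough picture of what a concordance summary between files contains, the following Python sketch counts shared variants for every pair of call sets. It is independent of bcbio-variation and does not reproduce its output; the function name and the choice of the Jaccard ratio as the summary statistic are assumptions made for illustration, and each call set is assumed to already be a set of hashable variant keys such as (chrom, pos, ref, alt) tuples.

```python
from itertools import combinations


def pairwise_concordance(call_sets):
    """Summarize overlap between every pair of named call sets.

    call_sets maps a label (e.g. a file name) to a set of variant keys.
    Returns {(label_a, label_b): (n_shared, n_union, jaccard)}.
    """
    summary = {}
    for a, b in combinations(sorted(call_sets), 2):
        shared = len(call_sets[a] & call_sets[b])
        union = len(call_sets[a] | call_sets[b])
        # Jaccard index: shared / union, defined as 0.0 for two empty sets.
        jaccard = shared / union if union else 0.0
        summary[(a, b)] = (shared, union, jaccard)
    return summary
```

A low ratio for one particular pair can point to a sample or calling run that deserves a closer look before taking the all-files intersection.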
