Question: How to merge VCF files, and consider variants within a set distance as one
roddy_p (United Kingdom) wrote, 4.9 years ago:

I'm trying to merge several VCF files, each containing insertions called in a different individual. Some insertions appear in only one individual, yet other individuals have insertions very close by (a few nucleotides away). I suspect that these are the same insertion. Is there a way to merge these VCF files while treating variants located within a set distance of each other (e.g. 100 nucleotides) as the same variant?

Thanks!
Tags: variant, vcf
Dan Gaston (Canada) wrote, 4.9 years ago:

There are several options for merging VCF files. GATK has CombineVariants, and vcftools has a vcf-merge script. I generally use CombineVariants. For your second problem, I suspect there are a variety of ways to approach it, but I'm not sure offhand which is "best". You could readily script something that filters out everything except indels and then reports indels that lie close to each other in different samples. In Python, for instance, you could use PyVCF to iterate over variants and samples; it has fairly extensive documentation.
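As a rough illustration of the scripting approach, here is a minimal sketch of the distance-based grouping step. It is not a feature of any of the tools named above. To stay runnable without extra packages it works on plain (sample, chromosome, position) tuples; the `Call` record and the `cluster_calls` and `summarize` helpers are hypothetical names made up for this example. With PyVCF you would presumably fill those fields from each record's `CHROM` and `POS` after keeping only indels. A call joins the current group if it is on the same chromosome and within `max_dist` of the group's first call, which is only one of several reasonable definitions of "close".

```python
from collections import namedtuple

# Hypothetical minimal record for this sketch. With PyVCF, the fields would
# come from record.CHROM and record.POS in each individual's VCF file.
Call = namedtuple("Call", ["sample", "chrom", "pos"])


def cluster_calls(calls, max_dist=100):
    """Group calls on the same chromosome whose positions lie within
    max_dist of the first call in the group."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.pos)):
        current = clusters[-1] if clusters else None
        if (current is not None
                and current[0].chrom == call.chrom
                and call.pos - current[0].pos <= max_dist):
            current.append(call)
        else:
            # Different chromosome or too far away: start a new group.
            clusters.append([call])
    return clusters


def summarize(clusters):
    """Return (chrom, first pos, last pos, sorted sample names) per group."""
    return [
        (c[0].chrom, c[0].pos, c[-1].pos, sorted({call.sample for call in c}))
        for c in clusters
    ]


# Made-up example calls, not real data.
calls = [
    Call("ind1", "chr1", 1000),
    Call("ind2", "chr1", 1004),
    Call("ind3", "chr1", 1090),
    Call("ind1", "chr1", 5000),
    Call("ind2", "chr2", 1002),
]
merged = summarize(cluster_calls(calls, max_dist=100))
for row in merged:
    print(row)
```

Anchoring each group to its first call keeps every group no wider than `max_dist`. Comparing against the previous call instead would chain neighbouring calls together and could produce much wider groups, so which rule is appropriate depends on how imprecise you expect the insertion breakpoints to be.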
