3 months ago
Clovering ▴ 30
I would like to merge two VCF files, one of which covers a more limited list of SNPs/regions than the other. The catch is that the merged output should retain only SNPs within the regions present in the more limited VCF. How would you recommend accomplishing this merge?
Thus far I have tried merging with bcftools, but have not found a way to limit the merge to the regions included in only one of the VCFs:
bcftools merge Pop_filtered.bg.vcf.gz Pop_filtered2.bg.vcf.gz --output-type z --output bcftools_merged.vcf.gz -m both --threads 6
EDIT: Thanks for adding what you've tried with bcftools.
What have you tried?
Have you read through the bcftools manual?
Ah, yes, thank you. I should have included this, and I have just edited the original post to add more detail. I have merged them with bcftools; however, I have not found an option to merge only at locations present in one VCF file. Perhaps there is a bcftools command that I am missing?
Change your approach slightly: merge first and then filter by locus, or apply a locus filter while merging (which is possible).
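A minimal sketch of the merge-then-filter route, for illustration only. It assumes that Pop_filtered.bg.vcf.gz is the more limited file, that both inputs are bgzipped and indexed, and that passing a VCF to --regions-file restricts output to the site positions recorded in that VCF (for true intervals, a BED or tab-delimited regions file would be needed instead). The second output name and the small run helper are hypothetical and not from the original post. The script only prints the commands unless bcftools and both inputs are present.

```shell
#!/bin/sh
# Sketch: merge first, then keep only sites listed in the more limited VCF.
limited="Pop_filtered.bg.vcf.gz"              # assumed to be the limited VCF
full="Pop_filtered2.bg.vcf.gz"
merged="bcftools_merged.vcf.gz"
restricted="bcftools_merged.limited.vcf.gz"   # hypothetical output name

# Execute only when bcftools and both inputs exist; otherwise just print.
dry_run=1
if command -v bcftools >/dev/null 2>&1 && [ -f "$limited" ] && [ -f "$full" ]; then
    dry_run=0
fi

# Hypothetical helper: echo the command, then run it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$dry_run" -eq 0 ]; then "$@"; fi
}

merge_then_filter() {
    # Step 1: merge both files, with the same merge options as the original post.
    run bcftools merge -m both --threads 6 \
        --output-type z --output "$merged" "$limited" "$full" || return 1
    # Step 2: index the merged file so it can be queried by region.
    run bcftools index --tbi "$merged" || return 1
    # Step 3: keep only records at positions present in the limited VCF.
    run bcftools view --regions-file "$limited" \
        --output-type z --output "$restricted" "$merged"
}

merge_then_filter
```

If the merged file is not indexed, --targets-file is the streaming counterpart of --regions-file in bcftools view and may be a reasonable substitute in step 3.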
Ah, so merge and then perform a region-specific filter using the old VCF? Could you recommend how to do this in bcftools? Would it look something like:
Why are you using merge with just one file? merge works on multiple input files. The manual also mentions ways to restrict yourself to the exact word you use ("region"). Have you tried any of those options?
Perhaps I got confused in the wording of the manual, but if I am understanding your suggestion, something like this would filter by regions in the Pop_filtered.bg.vcf.gz file?
You've nailed the filtering part and the merge part in this command. This should theoretically work. Let me know if you run into any problems.
Thanks so much, I will give it a try!
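For anyone finding this thread later, a single-step restricted merge of the kind discussed above might look like the sketch below. This is not necessarily the exact command from the thread. It assumes Pop_filtered.bg.vcf.gz is the more limited file, that both inputs are bgzipped and indexed, and that a VCF passed to --regions-file is read as a list of site positions. The output name and the run helper are hypothetical. The script prints the command and executes it only if bcftools and both inputs are available.

```shell
#!/bin/sh
# Sketch: restrict the merge itself to sites present in the limited VCF.
limited="Pop_filtered.bg.vcf.gz"                  # assumed to be the limited VCF
full="Pop_filtered2.bg.vcf.gz"
merged_limited="bcftools_merged.limited.vcf.gz"   # hypothetical output name

# Execute only when bcftools and both inputs exist; otherwise just print.
dry_run=1
if command -v bcftools >/dev/null 2>&1 && [ -f "$limited" ] && [ -f "$full" ]; then
    dry_run=0
fi

# Hypothetical helper: echo the command, then run it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$dry_run" -eq 0 ]; then "$@"; fi
}

restricted_merge() {
    # The limited VCF appears twice: once as the regions file, once as an input.
    run bcftools merge --regions-file "$limited" -m both --threads 6 \
        --output-type z --output "$merged_limited" "$limited" "$full"
}

restricted_merge
```

Compared with merging first and filtering afterwards, this avoids writing and indexing an intermediate merged file.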