GATK tool to merge INDEL with SNPs with the same set of samples
3.5 years ago
MAPK ★ 2.1k

Hi All,

I am trying to merge two VCF files for the same set of samples, one with SNPs and the other with INDELs. I have been looking at the three methods below, but I am not sure which one is the right option for me. The two VCFs are likely to have overlapping sites, so I am not sure whether Picard is the right tool. Could someone please help me figure out which tool to use?

Option 1.

java -jar picard.jar MergeVcfs I=SNPs.vcf.gz I=INDELS.vcf.gz O=WXS_INDELS_SNPs.vcf.gz

Option 2.

${JAVA} ${JAVAOPTS} -jar ${GATK} GatherVcfs -I SNPs.vcf.gz -I INDELS.vcf.gz -O WXS_INDELS_SNPs.vcf.gz

Option 3.

bcftools merge --merge all SNPs.vcf.gz INDELS.vcf.gz --force-samples -O z -o WXS_INDELS_SNPs.vcf.gz

PS. I just checked these three methods. I found the results from Option 1 and Option 3 are the same, and GatherVcfs is not suitable for this kind of merge.

VCF GATK • 2.6k views

BCFtools works just fine:

bcftools concat --allow-overlaps SNPs.vcf.gz INDELS.vcf.gz
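
A slightly fuller sketch, in case it helps: bcftools concat writes to standard output by default, and --allow-overlaps needs an index for each bgzip-compressed input. The function name concat_snps_indels is made up for illustration, and the file names in the usage comment are taken from the question.

```shell
# Hypothetical helper (the name is an assumption, not part of bcftools).
# Arguments: SNP VCF, INDEL VCF, output VCF (all bgzip-compressed .vcf.gz).
concat_snps_indels() {
  snps=$1
  indels=$2
  out=$3

  # --allow-overlaps requires indexed inputs; -t writes a .tbi index,
  # -f overwrites an existing index.
  bcftools index -f -t "$snps"
  bcftools index -f -t "$indels"

  # Stack the records of both files in position order.
  # -O z writes bgzip-compressed VCF, -o names the output file.
  bcftools concat --allow-overlaps -O z -o "$out" "$snps" "$indels"

  # Index the result so downstream tools can query it by region.
  bcftools index -f -t "$out"
}

# Usage:
#   concat_snps_indels SNPs.vcf.gz INDELS.vcf.gz WXS_INDELS_SNPs.vcf.gz
```

Both inputs must list the same samples in the same order for concat to work.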

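Merging and concatenation are two different operations. Since you have the same set of samples in both VCFs and want to join them, I assume you mean concatenation rather than merging (correct me if I am wrong). Concatenation is vertical joining: the records (rows) of the files are stacked and the sample columns stay the same. Merging is horizontal joining: the sample columns of different files are placed side by side. GatherVcfs concatenates, but it expects its inputs in genomic order with no overlapping positions, which is why it does not fit interleaved SNP and INDEL sites. bcftools merge is a true merge. Picard MergeVcfs, despite its name, requires identical sample lists and combines the records by position.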
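
To make the "vertical joining" idea concrete, here is a toy illustration using only standard shell tools on two tiny uncompressed VCFs. The sample name S1 and all records are invented for the example. It only shows what concatenation does to the rows; for real data, use bcftools concat.

```shell
set -eu

workdir=$(mktemp -d)   # scratch directory; remove it when done
tab=$(printf '\t')

# Minimal VCF header shared by both toy files (one sample, S1).
vcf_header() {
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
}

# Toy SNP file: two records.
{
  vcf_header
  printf '1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\n'
  printf '1\t300\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\n'
} > "$workdir/snps.vcf"

# Toy INDEL file: one record that falls between the two SNPs.
{
  vcf_header
  printf '1\t200\t.\tAT\tA\t.\tPASS\t.\tGT\t0/1\n'
} > "$workdir/indels.vcf"

# Vertical join: keep the header of the first file, then stack the records
# of both files and sort them by chromosome and position.
# (Chromosomes are sorted as text here, which is fine for one chromosome
# but is not the contig order a real tool would use.)
{
  grep '^#' "$workdir/snps.vcf"
  cat "$workdir/snps.vcf" "$workdir/indels.vcf" \
    | grep -v '^#' \
    | sort -t "$tab" -k1,1 -k2,2n
} > "$workdir/combined.vcf"

cat "$workdir/combined.vcf"
```

The combined file still has a single S1 column and gains rows. A horizontal merge would instead add sample columns, which is not what you want when both files describe the same samples.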

