GATK tool to merge INDEL with SNPs with the same set of samples
3.7 years ago
MAPK ★ 2.1k

Hi All,

I am trying to merge two VCF files, one with SNPs and the other with INDELs, both covering the same set of samples. I have looked at the three methods below, but I am not sure which one is right for my case. The two VCFs are likely to have overlapping sites, so I am not sure whether Picard is the right tool. Could someone please help me choose?

Option 1.

java -jar picard.jar MergeVcfs I=SNPs.vcf.gz I=INDELS.vcf.gz O=WXS_INDELS_SNPs.vcf.gz

Option 2.

${JAVA} ${JAVAOPTS} -jar ${GATK} GatherVcfs -I SNPs.vcf.gz -I INDELS.vcf.gz -O WXS_INDELS_SNPs.vcf.gz

Option 3.

bcftools merge --merge all SNPs.vcf.gz INDELS.vcf.gz --force-samples -O z -o WXS_INDELS_SNPs.vcf.gz

PS. I have now tried all three methods. Options 1 and 3 give the same result, whereas GatherVcfs (Option 2) is not suitable for this kind of merge.


Hi,

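BCFtools works just fine (with --allow-overlaps, both inputs must be bgzip-compressed and indexed):

bcftools concat --allow-overlaps SNPs.vcf.gz INDELS.vcf.gz -O z -o WXS_INDELS_SNPs.vcf.gz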

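Merging and concatenation are two different operations. Since both VCFs contain the same set of samples and you want to join them, I assume you mean concatenation rather than merging (correct me if I am wrong). Concatenation is vertical joining: records from files that share the same samples are stacked into one file. Merging is horizontal joining: samples from different files are combined side by side. GatherVcfs concatenates, but it expects its inputs in genomic order with no overlapping positions, so it does not fit interleaved SNP and INDEL files. Of the other options, bcftools merge is a true horizontal merge, while Picard MergeVcfs, despite its name, requires the same samples in every input and combines the records into one sorted file.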
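To make the vertical versus horizontal distinction concrete, here is a minimal Python sketch on toy records. It does not call bcftools, Picard, or GATK and does not parse real VCF; the record layout, the sample names (S1, S2, S3), and the `concat` and `merge` helpers are all hypothetical and exist only for illustration.

```python
# Toy records: (chrom, pos, ref, alt, {sample: genotype}).
# This is an illustration only, not real VCF parsing.
snps = [
    ("chr1", 100, "A", "G", {"S1": "0/1", "S2": "0/0"}),
    ("chr1", 300, "C", "T", {"S1": "1/1", "S2": "0/1"}),
]
indels = [
    ("chr1", 100, "AT", "A", {"S1": "0/0", "S2": "0/1"}),
    ("chr1", 200, "G", "GA", {"S1": "0/1", "S2": "0/1"}),
]
# A hypothetical second cohort with a different sample, used for the merge case.
other_cohort = [("chr1", 100, "A", "G", {"S3": "0/1"})]


def concat(a, b):
    """Vertical join: same samples, rows from both inputs interleaved by position."""
    samples_a = {s for rec in a for s in rec[4]}
    samples_b = {s for rec in b for s in rec[4]}
    if samples_a != samples_b:
        raise ValueError("concatenation expects identical sample sets")
    # Sorting by (chrom, pos) lets overlapping sites from both inputs sit side by side.
    return sorted(a + b, key=lambda rec: (rec[0], rec[1]))


def merge(a, b):
    """Horizontal join: identical sites gain the sample columns of both inputs."""
    sites = {}
    for rec in a + b:
        site = rec[:4]  # (chrom, pos, ref, alt) identifies the record
        sites.setdefault(site, {}).update(rec[4])
    return [site + (genotypes,) for site, genotypes in sorted(sites.items())]


combined = concat(snps, indels)      # more rows, same sample columns
merged = merge(snps, other_cohort)   # same sites, more sample columns

print([rec[1] for rec in combined])
print(sorted(merged[0][4]))
```

In this toy model, `concat` returns four rows that still carry only S1 and S2, which is the situation in the question (SNP and INDEL files from the same samples). `merge` keeps the two SNP sites and adds S3 to the site it shares with the second cohort.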
