Merging multiple vcfs with GATK's CombineVariants
7.5 years ago
t86dan ▴ 30

Hey guys, I'm trying to merge multiple VCF files with the Genome Analysis Toolkit (v3.5). With only a few samples, the command would look something like this:

java -jar GenomeAnalysisTK.jar -T CombineVariants -R ../reference.fa --variant sample1.vcf --variant sample2.vcf --variant sample3.vcf --variant sample4.vcf --variant sample5.vcf --variant sample6.vcf -o merge_file.vcf

My problem is that I have many VCF files to merge (not just 6 as in the example above). I have been digging through the command options, but there seems to be no option to select, for example, a whole directory of VCFs, or to write something like ''--variant *.vcf'' so that it picks up all of my VCF files and applies CombineVariants to them.

So my question is: is typing the VCF files one by one the only way to run this command with many VCFs?

Combinevariants vcf merging • 13k views
7.5 years ago

As far as I know it's not documented, but you can pass a list file as the argument to --variant. For example:

ls *vcf > vcfs.list
java -jar GenomeAnalysisTK.jar -T CombineVariants -R $REF --variant vcfs.list -o combined.vcf -genotypeMergeOptions UNIQUIFY
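A minimal sketch of the list-file approach, in case it helps to see it end to end. The temporary directory and the empty sample files are hypothetical placeholders so the snippet can be tried without real data, and the GATK command is only printed, not executed. My understanding is that GATK 3 recognises list files by the `.list` extension, so the sketch keeps that suffix; treat that as an assumption and check it against your version.

```shell
# Hypothetical demo directory with empty placeholder VCFs (not real data).
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
touch sample1.vcf sample2.vcf sample3.vcf

# Write one VCF path per line; printf with a glob avoids parsing ls output.
printf '%s\n' *.vcf > vcfs.list

# Count the listed files as a quick sanity check before merging.
n_files="$(wc -l < vcfs.list | tr -d ' ')"
echo "Listed ${n_files} VCF files in vcfs.list"

# Print (rather than run) the CombineVariants call from the answer above.
# reference.fa is a placeholder used when REF is not set.
echo java -jar GenomeAnalysisTK.jar -T CombineVariants \
    -R "${REF:-reference.fa}" --variant vcfs.list \
    -o combined.vcf -genotypeMergeOptions UNIQUIFY
```

Checking the line count first is a cheap way to catch a glob that matched fewer files than expected.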
7.5 years ago
t86dan ▴ 30

Thank you very much! I actually figured it out and did it slightly differently from what you suggested, although it amounts to much the same thing. I wrote a script listing all the files (I already had them listed, so I just added the --variant part to each file). I ended up doing it manually, but I guess it was a little less painful than typing them one by one on the command line.

java -jar GenomeAnalysisTK.jar \
-T CombineVariants \
-R REFERENCE \
--variant sample1.vcf \
--variant sample2.vcf \
--variant sample3.vcf \
--variant sample4.vcf \
--variant sample5.vcf \
--variant etc.vcf \
-o combined.vcf

I appreciate your feedback. I ran it again following your instructions and of course it worked. Now I know how to do the same thing with a simple command (ls > list.txt).
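For anyone who prefers explicit --variant flags, the step of prefixing each file with --variant can be scripted rather than done by hand. This is only a sketch under assumptions: the temporary directory, the empty sample files and reference.fa are hypothetical placeholders, and the command is printed instead of executed. It uses the shell's positional parameters so it works in plain sh as well as bash.

```shell
# Hypothetical demo directory with empty placeholder VCFs (not real data).
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
touch sample1.vcf sample2.vcf sample3.vcf

# Build "--variant FILE" pairs in the positional parameters.
set --
for vcf in *.vcf; do
    # Skip the literal pattern if the glob matched nothing.
    [ -e "$vcf" ] || continue
    set -- "$@" --variant "$vcf"
done

# Assemble the command line for inspection; reference.fa is a placeholder.
# Note: a combined.vcf left over from an earlier run in this directory
# would also match *.vcf, so write the output elsewhere or remove it first.
cmd_line="java -jar GenomeAnalysisTK.jar -T CombineVariants -R reference.fa $* -o combined.vcf"
echo "$cmd_line"
```

To run it for real, you would call java directly with "$@" in place of the printed string, which keeps file names containing spaces intact.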


Please use ADD COMMENT to reply to earlier posts, so that the thread remains logically structured and easy to follow.

It's good that you found a solution, although I would argue that yours likely takes longer and is more error-prone :-) Good luck with the rest of your analysis.
