Question: How to combine chromosome vcf files
2.7 years ago, williamsbrian5064 wrote:


Easy question here. I split my VCF file up by chromosome to try to save some time. I now want to combine the files into one complete VCF file with all the chromosomes. Do I want to merge the files or concatenate them?

I just ran the command

cat *.vcf > cat.chr1.chr2.vcf

This was on only two of the chromosomes, and when I looked at the output there was data for the first chromosome only and nothing for the second. Am I doing something wrong here? I could also try VCFtools if this "cat" command won't work for what I am trying to do.

2.7 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

First alternative: use one VCF file to get the header, and then concatenate all the VCF files (including the first) without their headers. This was your error: cat copies every file's header into the output.

grep '^#' chr1.vcf > merge.vcf
grep -h -v '^#' chr1.vcf chr2.vcf chr3.vcf chr4.vcf >> merge.vcf
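
The two grep steps can also be wrapped in a small shell function so the file list is only typed once. This is just a sketch: the name concat_vcfs is made up for illustration, and it assumes every input carries the same header (same samples, same contig lines) and that the files are passed in the chromosome order you want in the output. The -h option stops grep from prefixing each record with the file name when it reads several files.

```shell
# Hypothetical helper (not part of any tool): concat_vcfs OUT IN1 [IN2 ...]
# Assumes all inputs share the same header and are given in the desired order.
concat_vcfs() {
    out=$1
    shift
    # Header lines (those starting with '#') are taken from the first input only.
    grep '^#' "$1" > "$out"
    # Records from every input, in the order given; -h suppresses the
    # "filename:" prefix that grep adds when reading more than one file.
    grep -h -v '^#' "$@" >> "$out"
}
```

Usage would look like: concat_vcfs merge.vcf chr1.vcf chr2.vcf chr3.vcf chr4.vcf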

Second alternative: use Picard GatherVcfs, which checks the headers, the order of the inputs, etc.

java -jar picard.jar GatherVcfs I=chr1.vcf  I=chr2.vcf I=chr3.vcf I=chr4.vcf  O=merged.vcf
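
Whichever alternative you use, it may be worth checking that every chromosome actually made it into the output. A minimal sketch, assuming an uncompressed, tab-delimited VCF; the function name count_by_chrom is made up for illustration. It prints each value of the first column (CHROM) with its record count, in the order the chromosomes first appear in the file.

```shell
# Hypothetical helper: count_by_chrom FILE
# Assumes FILE is an uncompressed, tab-delimited VCF.
count_by_chrom() {
    awk -F '\t' '
        /^#/ { next }                    # skip header lines
        !($1 in n) { order[++k] = $1 }   # remember first-seen order of CHROM values
        { n[$1]++ }                      # count records per CHROM
        END { for (i = 1; i <= k; i++) printf "%s\t%d\n", order[i], n[order[i]] }
    ' "$1"
}
```

For example, count_by_chrom merged.vcf should list one line per chromosome that went into the merge; a missing chromosome means something was dropped along the way.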

Wonderful Pierre! Worked like a charm! Thank you so much for the help!

written 2.7 years ago by williamsbrian5064

Hi Pierre, I wanted to merge VCF files that had already been split by chromosome into a single file. Each one is compressed, though, i.e. chr1.vcf.gz, chr2.vcf.gz, ..., chrN.vcf.gz.

I could not see an output file merged.vcf.gz when I used the command below. Is it because the files are compressed?

java -jar /programs/picard-tools-2.18.11/picard.jar GatherVcfs I=chr1.vcf.gz I=chr2.vcf.gz O=merged.vcf.gz
written 2.1 years ago by mab65880

Hi Pierre, do you know if it's possible to use GatherVcfs with a list of files rather than a separate I= for each? Thanks!

written 12 months ago by Jautis300

Try using a text file WITH THE SUFFIX .list containing the paths to the files.
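
One way to build such a file is sketched below. The function name write_vcf_list is made up for illustration, and two things are assumptions on my part: that the list holds one path per line, and that your sort supports -V (version sort, available in GNU coreutils), which puts chr2 before chr10. The order GatherVcfs actually needs is the contig order in the VCF header, so check the resulting list before using it.

```shell
# Hypothetical helper: write_vcf_list LIST IN1 [IN2 ...]
# Assumes one path per line is the expected list format and that
# sort supports -V (GNU coreutils version sort).
write_vcf_list() {
    list=$1
    shift
    # One path per line, version-sorted so that chr2 comes before chr10.
    printf '%s\n' "$@" | sort -V > "$list"
}
```

Usage would then be something like: write_vcf_list inputs.list chr*.vcf, followed by java -jar picard.jar GatherVcfs I=inputs.list O=merged.vcf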

written 12 months ago by Pierre Lindenbaum