Question: How to combine chromosome vcf files
21 months ago by
williamsbrian5064230 wrote:

Hi,

Easy question here. I split my VCF file by chromosome to try to save some time. I now want to combine the files into one complete VCF file with all the chromosomes. Do I want to merge the files or concatenate them?

I just ran the command

cat *.vcf > cat.chr1.chr2.vcf

This was only on two of the chromosomes, and when I looked at the output file there was only data for the first chromosome and nothing for the second. Am I doing something wrong here? I could also try vcftools if this "cat" command won't work for what I am trying to do.

ADD COMMENTlink modified 21 months ago by Pierre Lindenbaum127k • written 21 months ago by williamsbrian5064230
21 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum127k wrote:

First alternative: use one VCF file to get the header, then concatenate all the VCF files without their headers (repeating the header of the second file in the middle of the output was your error).

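# take the header from the first file only
grep '^#' chr1.vcf > merge.vcf
# -h suppresses the file-name prefix that grep adds when given several files
grep -h -v '^#' chr1.vcf chr2.vcf chr3.vcf chr4.vcf >> merge.vcf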
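
The two grep commands above generalize to any number of per-chromosome files. A minimal sketch, assuming uncompressed inputs that all share the same header and are passed in the desired chromosome order; the function name concat_vcfs is hypothetical:

```shell
# concat_vcfs OUT IN1 [IN2 ...]
# Hypothetical helper: writes the header of IN1, then the records of every input.
# Assumes all inputs have equivalent headers (same samples, same contigs).
concat_vcfs() {
    local out
    out=$1
    shift
    # Header lines (starting with '#') come from the first input only.
    grep '^#' "$1" > "$out"
    # Records from every input, in the order given on the command line.
    # -h suppresses the "filename:" prefix grep adds when given several files.
    grep -h -v '^#' "$@" >> "$out"
}
```

It would be called as: concat_vcfs merge.vcf chr1.vcf chr2.vcf chr3.vcf chr4.vcf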

Second alternative: use Picard GatherVcfs (http://broadinstitute.github.io/picard/command-line-overview.html), which checks the headers, the order of the inputs, etc.

java -jar picard.jar GatherVcfs I=chr1.vcf  I=chr2.vcf I=chr3.vcf I=chr4.vcf  O=merged.vcf
ADD COMMENTlink written 21 months ago by Pierre Lindenbaum127k

Wonderful Pierre! Worked like a charm! Thank you so much for the help!

ADD REPLYlink written 21 months ago by williamsbrian5064230

Hi Pierre, I wanted to merge VCF files that had already been split by chromosome into a single file. Each one is compressed, i.e. chr1.vcf.gz, chr2.vcf.gz, ..., chrn.vcf.gz.

I could not see an output file merged.vcf.gz when I used the command below. Is it because the files are compressed?

java -jar /programs/picard-tools-2.18.11/picard.jar GatherVcfs I=chr1.vcf.gz I=chr2.vcf.gz O=merged.vcf.gz
ADD REPLYlink written 14 months ago by mab65860
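
For gzip-compressed inputs like these, the header-plus-records approach from the answer above can be applied by decompressing on the fly, independently of Picard. A minimal sketch, assuming all files share the same header and are given in chromosome order; the function name concat_vcfs_gz is hypothetical, and writing plain gzip output is an assumption (tools that index VCF files, such as tabix, expect bgzip compression instead):

```shell
# concat_vcfs_gz OUT.vcf.gz IN1.vcf.gz [IN2.vcf.gz ...]
# Hypothetical helper: header from the first input, records from all inputs.
concat_vcfs_gz() {
    local out
    out=$1
    shift
    {
        # Header lines come from the first compressed file only.
        gzip -dc "$1" | grep '^#'
        # gzip -dc with several files writes their contents back to back;
        # grep -v then drops every header line from that stream.
        gzip -dc "$@" | grep -v '^#'
    } | gzip -c > "$out"
}
```

It would be called as: concat_vcfs_gz merged.vcf.gz chr1.vcf.gz chr2.vcf.gz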

Hi Pierre, do you know if it's possible to use GatherVCFs with a list of files rather than a separate I= for each? Thanks!

ADD REPLYlink written 7 weeks ago by Jautis290

Try using a text file WITH THE SUFFIX .list containing the paths to the files.

ADD REPLYlink written 7 weeks ago by Pierre Lindenbaum127k
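
A minimal sketch of building such a list file, assuming the inputs are named chr1.vcf to chrN.vcf in the current directory and should be listed in numeric chromosome order (a plain shell glob would sort chr10.vcf before chr2.vcf); the function name make_vcf_list and the file name inputs.list are hypothetical:

```shell
# make_vcf_list N LIST
# Hypothetical helper: writes chr1.vcf ... chrN.vcf, one path per line, to LIST.
make_vcf_list() {
    local n
    local list
    local i
    n=$1
    list=$2
    # Start from an empty list file.
    : > "$list"
    i=1
    while [ "$i" -le "$n" ]; do
        printf '%s\n' "chr${i}.vcf" >> "$list"
        i=$((i + 1))
    done
}
```

After running make_vcf_list 4 inputs.list, the list would be passed in place of the individual inputs, following the suggestion above: java -jar picard.jar GatherVcfs I=inputs.list O=merged.vcf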