Question: Merging multiple vcf files into one
hpapoli (Sweden) wrote, 3.3 years ago:

Hello,

I have 280 VCF files, each containing about 200 SNPs from a genotyping experiment. I need to merge them into a single combined VCF that contains all SNPs for all individuals; if an individual lacks a SNP, its genotype in the combined file should be coded as missing (./. or .).

I am using the following command from vcftools: vcf-merge A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz

It worked for two files, although it took about 1 hour. It has now been running for 1 day on all 280 files. Is this the only way to merge a large number of VCF files, or is there a more efficient way?

Thank you

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 3.3 years ago:

GATK CombineVariants https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_variantutils_CombineVariants.php

find . -name "*.vcf.gz" > input.list

java -jar GenomeAnalysisTK.jar \
       -T CombineVariants \
       -R ref.fa \
       --variant input.list \
       -o out.vcf \
       -genotypeMergeOptions REQUIRE_UNIQUE
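
Since REQUIRE_UNIQUE expects every sample name to be unique across the inputs, it may be worth checking the 280 files for repeated sample names before starting a long run. The following is only a rough sketch: duplicate_samples is a hypothetical helper (not part of GATK or vcftools), and it assumes each line of input.list is a path to a gzip/bgzip-compressed VCF with a standard #CHROM header line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK or vcftools): print every sample
# name that occurs more than once across the VCFs named in a list file.
duplicate_samples() {
    local list="$1" vcf
    while IFS= read -r vcf; do
        # Skip empty lines in the list file.
        [ -n "$vcf" ] || continue
        # In the #CHROM header line, sample names start at column 10.
        gzip -cd "$vcf" | awk -F'\t' '
            /^#CHROM/ { for (i = 10; i <= NF; i++) print $i }'
    done < "$list" | sort | uniq -d
}
```

Running duplicate_samples input.list prints one repeated sample name per line; no output means no sample name was seen twice.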

Thanks for the answer, Pierre. A quick question about the reference genome, though: say we have a collection of VCF files produced at different times and therefore against different reference genomes. Is there an easy solution with the CombineVariants command, or should we lift over all the incompatible ones to a single reference genome and then combine them?

written 2.1 years ago by Nikleotide