Question: Merging VCF Chunks After GATK UnifiedGenotyper?
Asked 5.7 years ago by 14134125465346445 (United Kingdom):

What is the recommended way of merging the resulting vcf files when running GATK UnifiedGenotyper version 1.6 on chunks of, say, 10M?

I've got files like this:

chr10.0001.vcf
chr10.0002.vcf
chr10.0003.vcf
chr10.0004.vcf
chr10.0005.vcf
chr10.0006.vcf
chr10.0007.vcf
chr10.0008.vcf
chr10.0009.vcf
chr10.0010.vcf
chr10.0011.vcf
chr10.0012.vcf
chr10.0013.vcf
chr10.0014.vcf

where the first file covers the first 10M, the second covers the following 10M, and so on. I want to end up with a single chr10.vcf file that includes all of the ones above.

Is simply running find -name "chr10.*" | sort | xargs cat on the files enough?

Tags: vcf, gatk, vcftools
Answer by Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087), 5.7 years ago:

Use vcf-concat: http://vcftools.sourceforge.net/perl_module.html#vcf-concat
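
To illustrate why a plain cat is not quite enough: every chunk starts with its own ## and #CHROM header lines, so cat would repeat them in the middle of the output. Below is a minimal sketch in plain shell of what a header-aware concatenation looks like. It assumes that all chunks share the same header and samples and that the files are passed in genomic order. concat_vcf_chunks is a made-up helper name, not part of vcftools.

```shell
# Sketch only: keep the header of the first chunk, drop it from the rest.
# Assumes identical headers/samples in every chunk and a non-empty first file.
concat_vcf_chunks() {
    # FNR == NR is true only while awk reads the first file, so that file is
    # printed in full; for later files only non-header lines are printed.
    awk 'FNR == NR || !/^#/' "$@"
}
```

With the zero-padded names from the question the shell glob should already expand in the right order, so something like concat_vcf_chunks chr10.*.vcf > chr10.vcf would do. vcf-concat takes care of this kind of header handling for you.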

Answer by Jorge Amigo (Santiago de Compostela, Spain), 5.7 years ago:

If you are working with GATK, you may find the CombineVariants walker very useful. The first example in its documentation shows how to merge any set of .vcf files by passing each one through the --variant option:

java -Xmx2g -jar GenomeAnalysisTK.jar \
-R ref.fasta \
-T CombineVariants \
--variant input1.vcf \
--variant input2.vcf \
-o output.vcf \
-genotypeMergeOptions UNIQUIFY
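
With 14 chunks, typing every --variant line by hand gets tedious. A small sketch of how the argument list could be generated instead, assuming file names without spaces. build_variant_args is a hypothetical helper, not a GATK feature.

```shell
# Sketch only: print one "--variant FILE" pair per input file, one per line.
# Assumes file names contain no whitespace, because the output is meant to be
# word-split by the shell when spliced into the java command.
build_variant_args() {
    for f in "$@"; do
        printf '%s %s\n' --variant "$f"
    done
}
```

The output can then stand in for the two --variant lines in the command above, for example by writing $(build_variant_args chr10.*.vcf) in their place. The remaining options stay as they are.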

GATK also offers a tool, CatVariants, that does a smart concatenation of variant files: it is faster than CombineVariants but safer than a regular cat. See http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_tools_CatVariants.html

(Reply by vdauwera, written 5.7 years ago)