How to merge multiple VCF files from different samples into one multi-sample VCF file using GATK
5.3 years ago
S AR ▴ 80

I have 2300 VCF files from 2300 bacterial samples, generated with GATK HaplotypeCaller. I now want to combine them all into a single multi-sample VCF file. I used CombineVariants from GATK, but it gives an error when I pass -V *.vcf.gz. I can't write -V followed by a file name 2300 times. Can anybody help me with this?

 java -Xmx8G -jar /home/sark/GenomeAnalysisTK-3.8.1.0/GenomeAnalysisTK.jar -T CombineVariants -R ../ref/M._tuberculosis_H37Rv_2015-11-13.fasta --interval_padding 50  -V *raw.vcf.gz -L ../effluxgenes.bed -o multiple.vcf

Error:

    ##### ERROR
    ##### ERROR MESSAGE: Invalid argument value 'ERR040140_raw.vcf.gz' at position 8.
    ##### ERROR Invalid argument value 'ERR046796_raw.vcf.gz' at position 9.
    ##### ERROR Invalid argument value 'ERR046903_raw.vcf.gz' at position 10.
    ##### ERROR Invalid argument value 'ERR067581_raw.vcf.gz' at position 11.
    ##### ERROR Invalid argument value 'ERR067593_raw.vcf.gz' at position 12.
    ##### ERROR Invalid argument value 'ERR067606_raw.vcf.gz' at position 13.
    ##### ERROR Invalid argument value 'ERR067607_raw.vcf.gz' at position 14.
    ##### ERROR Invalid argument value 'ERR067608_raw.vcf.gz' at position 15.
    ##### ERROR Invalid argument value 'ERR067609_raw.vcf.gz' at position 16.
   --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    ##### ERROR Invalid argument value 'SRR671854_raw.vcf.gz' at position 173.
    ##### ERROR Invalid argument value 'SRR671855_raw.vcf.gz' at position 174.
    ##### ERROR Invalid argument value 'SRR671866_raw.vcf.gz' at position 175.
    ##### ERROR ------------------------------------------------------------------------------------------

Can anyone help me do this with a loop?

GATK loop scripting • 2.4k views
5.3 years ago
 (...) -V *raw.vcf.gz (...)

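GATK doesn't work like this. The shell expands *raw.vcf.gz into a space-separated list of file names before GATK runs, so only the first file is attached to -V and the remaining ones are rejected as invalid arguments.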

You need

   -V file1.raw.vcf.gz    -V file2.raw.vcf.gz    -V file3.raw.vcf.gz

or put all the paths in a file with the '.list' suffix

$ ls *raw.vcf.gz > vcf.list

$ java -Xmx8G -jar  (...) -V vcf.list (...)
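
Since the question asked for a loop, here is a minimal sketch of building the same list file with one, which also makes it easy to check how many files were picked up before launching GATK. The function name make_vcf_list is hypothetical, and the sketch assumes the per-sample files match *raw.vcf.gz as in the question.

```shell
# make_vcf_list DIR OUT: write one VCF path per line to OUT.
# (Hypothetical helper name, not part of GATK.)
make_vcf_list() {
    dir=$1
    out=$2
    : > "$out"                      # start with an empty list file
    for f in "$dir"/*raw.vcf.gz; do
        [ -e "$f" ] || continue     # skip the unexpanded pattern if nothing matches
        printf '%s\n' "$f" >> "$out"
    done
}

# Assumption: the VCFs sit in the current directory, as in the question.
make_vcf_list . vcf.list
echo "$(wc -l < vcf.list) files listed in vcf.list"
```

If the count matches the expected 2300, pass the file to the original command with -V vcf.list in place of -V *raw.vcf.gz.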
Thank you, it worked.
