Question: How to merge/combine multiple VCF files from different samples into one multi-sample VCF file using GATK
1
gravatar for S AR
6 months ago by
S AR50
Pakistan
S AR50 wrote:

I have 2300 VCF files from 2300 bacterial samples, generated with GATK HaplotypeCaller. I want to combine them all into a single multi-sample VCF file. I tried CombineVariants from GATK, but it gives an error when I pass -V *.vcf.gz. I can't write -V followed by a file name 2300 times. Can anybody help me with this?

 java -Xmx8G -jar /home/sark/GenomeAnalysisTK-3.8.1.0/GenomeAnalysisTK.jar -T CombineVariants -R ../ref/M._tuberculosis_H37Rv_2015-11-13.fasta --interval_padding 50  -V *raw.vcf.gz -L ../effluxgenes.bed -o multiple.vcf

Error:

    ##### ERROR
    ##### ERROR MESSAGE: Invalid argument value 'ERR040140_raw.vcf.gz' at position 8.
    ##### ERROR Invalid argument value 'ERR046796_raw.vcf.gz' at position 9.
    ##### ERROR Invalid argument value 'ERR046903_raw.vcf.gz' at position 10.
    ##### ERROR Invalid argument value 'ERR067581_raw.vcf.gz' at position 11.
    ##### ERROR Invalid argument value 'ERR067593_raw.vcf.gz' at position 12.
    ##### ERROR Invalid argument value 'ERR067606_raw.vcf.gz' at position 13.
    ##### ERROR Invalid argument value 'ERR067607_raw.vcf.gz' at position 14.
    ##### ERROR Invalid argument value 'ERR067608_raw.vcf.gz' at position 15.
    ##### ERROR Invalid argument value 'ERR067609_raw.vcf.gz' at position 16.
   --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    ##### ERROR Invalid argument value 'SRR671854_raw.vcf.gz' at position 173.
    ##### ERROR Invalid argument value 'SRR671855_raw.vcf.gz' at position 174.
    ##### ERROR Invalid argument value 'SRR671866_raw.vcf.gz' at position 175.
    ##### ERROR ------------------------------------------------------------------------------------------

Can anyone help me write a loop for this?

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087), 6 months ago, wrote:
 (...) -V *raw.vcf.gz (...)

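GATK doesn't work like this. The shell expands the glob before GATK sees it, so -V is paired with only one file name and the remaining file names arrive as bare arguments, which GATK rejects as invalid argument values.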

You need

   -V file1.raw.vcf.gz    -V file2.raw.vcf.gz    -V file3.raw.vcf.gz

or put all the paths in a file with the '.list' suffix

$ ls *raw.vcf.gz > vcf.list

$ java -Xmx8G -jar  (...) -V vcf.list (...)
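
The two options above can be sketched as small shell helpers. This is only an illustration under some assumptions: the function names `write_vcf_list` and `build_variant_args` are made up for this example, the files are assumed to match `*raw.vcf.gz` as in the question, and file names are assumed to contain no spaces. The GATK call itself is left commented out, with its elided options kept elided.

```shell
# Option 1: write one VCF path per line into a ".list" file.
write_vcf_list() {
    # $1: directory holding the per-sample VCFs; $2: output list file
    dir="$1"
    list="$2"
    : > "$list"                      # start from an empty list
    for f in "$dir"/*raw.vcf.gz; do
        [ -e "$f" ] || continue      # an unmatched glob stays literal; skip it
        printf '%s\n' "$f" >> "$list"
    done
    [ -s "$list" ]                   # non-zero exit status if nothing matched
}

# Option 2: build the repeated "-V file" pairs with a loop.
build_variant_args() {
    # $1: directory holding the per-sample VCFs
    dir="$1"
    args=""
    for f in "$dir"/*raw.vcf.gz; do
        [ -e "$f" ] || continue
        args="$args -V $f"           # assumes file names without spaces
    done
    printf '%s\n' "${args# }"        # drop the leading space
}

# Usage sketch:
# write_vcf_list . vcf.list && wc -l < vcf.list   # should print one line per sample
# java -Xmx8G -jar (...) -V vcf.list (...)
```

Checking the line count of vcf.list against the expected number of samples before starting GATK is a cheap way to catch a pattern that matched too few files. With thousands of samples the list file is usually the more comfortable of the two options, since the command line stays short.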

S AR, 6 months ago, replied:

Thank you, it worked.