How to merge VCF files (output of VarScan2 somatic)?
2.8 years ago
Raheleh ▴ 200

Hello,

I have WES data (paired-end, Illumina) from tumor samples and their matched normal samples from 13 patients. I aligned the reads to hg38 with BWA-MEM and called somatic variants with VarScan2. I now have six filtered files (fpfilter_Passed.vcf) per patient: Somatic, LOH, and Germline, each for SNPs and for indels.

I am trying to merge the vcf files of each patient using CombineVariants and MergeVcfs (Picard) of gatk, but no success yet. Can anyone tell me whether I am on the right track? Are there any other tools that I could try?

Thanks for any help!

Tags: varscan2, merge, vcf, SNP, indel
2.8 years ago

bcftools merge should be the answer ;)

For this you first have to compress each VCF file with bgzip (plain gzip will not work) and index it with tabix:

$ bgzip -c sample1.vcf > sample1.vcf.gz
$ tabix -p vcf sample1.vcf.gz

Then you can use:

$ bcftools merge sample1.vcf.gz sample2.vcf.gz ... > merged.vcf
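The steps above can be wrapped in a small loop. This is only a minimal sketch, assuming bgzip, tabix and bcftools are on the PATH; the function name `merge_vcfs` is hypothetical. The `--force-samples` flag is an assumption about the input: VarScan2 somatic VCFs normally all use the sample names NORMAL and TUMOR, and `bcftools merge` refuses duplicate sample names unless that flag is given.

```shell
#!/usr/bin/env bash
# Sketch: compress, index, and merge a set of VCF files with bcftools.
# Assumes bgzip and tabix (htslib) and bcftools are installed.

merge_vcfs() {
    local out=$1; shift
    local vcf gz=()
    for vcf in "$@"; do
        # bgzip writes BGZF blocks; plain gzip output cannot be indexed by tabix
        bgzip -c "$vcf" > "$vcf.gz" || return 1
        tabix -p vcf "$vcf.gz" || return 1
        gz+=("$vcf.gz")
    done
    # --force-samples is an assumption: without it bcftools merge stops on
    # duplicate sample names; with it, clashing names get a file-index prefix.
    bcftools merge --force-samples -o "$out" "${gz[@]}"
}

# Run only when arguments are given: output file first, then the input VCFs.
if [ "$#" -ge 2 ]; then
    merge_vcfs "$@"
fi
```

Saved as, say, merge_vcfs.sh (a hypothetical name), it could be called as `bash merge_vcfs.sh merged.vcf sample1.vcf sample2.vcf`. If all input files carry the same sample columns in the same order, `bcftools concat -a` on the same compressed and indexed files may be closer to a per-patient combination, since `merge` is designed for files with different samples.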

"I am trying to merge the vcf files of each patient using CombineVariants and MergeVcfs (Picard) of gatk, but no success yet."

What was the problem?

fin swimmer


Many thanks for your reply. This is my command:

java -jar GenomeAnalysisTK.jar -T CombineVariants -R hg38.fa --variant Germline.hc.fpfilterPassed.vcf --variant Somatic.hc.fpfilterPassed.vcf -o output.vcf -genotypeMergeOptions UNIQUIFY

And this is the error I get when using CombineVariants:

Error: Unable to access jar file GenomeAnalysisTK.jar

There is no jar file with that name in the downloaded package (gatk-4.0.11.0). After a lot of searching, I found that replacing java -jar GenomeAnalysisTK.jar with gatk might solve the problem, but then I got this error:

A USER ERROR has occurred: '-T' is not a valid command.

I also tried without -T, but without success.


Regarding the MergeVcfs tool, this is the command (taken from the recommended workflow on the website):

java -jar picard.jar MergeVcfs I=Germline.hc.fpfilterPassed.vcf I=Somatic.hc.fpfilterPassed.vcf O=Output.vcf.gz

When I run the command, I get this:

NOTE: Picard's command line syntax is changing.
For more information, please see:
https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
The command line looks like this in the new syntax:
MergeVcfs -I Germline.hc.fpfilterPassed.vcf -I Somatic.hc.fpfilterPassed.vcf -O Output.vcf.gz

I tried this command:

java -jar picard.jar MergeVcfs -I Germline.hc.fpfilterPassed.vcf -I Somatic.hc.fpfilterPassed.vcf -O Output.vcf.gz

But I got this error:

ERROR: Invalid argument '-I'

Please help me solve this problem. Many thanks!


Hello,

GATK's CombineVariants isn't available in GATK4, and I think this tool doesn't do what you would like to do. It's for combining variants from the same sample into a single file, but you would like to have different samples in one file.

The message from MergeVcfs is just a notice. The tool should still produce valid output.
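For the GATK4 package mentioned above, tools are called as `gatk ToolName` with no `-T`, and MergeVcfs is bundled with GATK4. A minimal sketch of such a call follows, assuming the `gatk` wrapper script is on the PATH; the function name `gatk_merge_vcfs` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: call the MergeVcfs tool bundled with GATK4.
# Assumes the gatk launcher from the GATK4 package is on PATH.

gatk_merge_vcfs() {
    local out=$1; shift
    local vcf args=()
    for vcf in "$@"; do
        args+=(-I "$vcf")        # one -I per input VCF
    done
    # GATK4 syntax: tool name directly after "gatk", no -T switch
    gatk MergeVcfs "${args[@]}" -O "$out"
}

# Run only when arguments are given: output file first, then the input VCFs.
if [ "$#" -ge 2 ]; then
    gatk_merge_vcfs "$@"
fi
```

MergeVcfs expects the inputs to share the same samples and needs a sequence dictionary, either from the VCF headers or supplied through its SEQUENCE_DICTIONARY argument, so VCFs without contig header lines may need that argument. With a standalone picard.jar from the transition period, the new `-I`/`-O` style is, as far as I know, only accepted when Java is started with `-Dpicard.useLegacyParser=false`; otherwise the `I=`/`O=` form has to be used.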

fin swimmer


Thank you, fin swimmer. Yes, it produced an output, but it contains only the header lines and nothing more.

Edit: Sorry, I made a mistake. It produces nothing: after that notice appears, I get no output file at all.
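To tell "header only" apart from "no output", it can help to count the non-header lines of a VCF directly. This is a minimal sketch using only gzip and grep; the function name `count_vcf_records` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count variant records (lines not starting with '#') in a VCF.
# Handles plain VCFs and gzip/bgzip-compressed ones (BGZF is gzip-compatible).

count_vcf_records() {
    local vcf=$1
    if [ ! -s "$vcf" ]; then
        echo "missing or empty file: $vcf" >&2
        return 1
    fi
    if [[ $vcf == *.gz ]]; then
        gzip -dc "$vcf" | grep -vc '^#' || true   # grep exits 1 on a count of 0
    else
        grep -vc '^#' "$vcf" || true
    fi
}

# Print "<file> <record count>" for every file given on the command line.
for f in "$@"; do
    printf '%s\t%s\n' "$f" "$(count_vcf_records "$f")"
done
```

A count of 0 means the file holds only header lines; an error about a missing file means the tool wrote nothing at all.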
