Question: How to combine single-sample VCFs into a multi-sample VCF, and how to interpret a multi-sample VCF of a bacterial genome?
10 months ago by
S AR50
Pakistan
S AR50 wrote:

I used GATK HaplotypeCaller for variant calling on 2300 MTB strains. Now I want to combine the results into a multi-sample VCF.

I used CombineVariants, but I am confused about whether this is correct.

Each of my VCFs comes from a single isolate, i.e. one VCF per sample. But GATK says: CombineVariants can be used to combine variant calls that were produced from the same samples but using different methods, for comparison.

My variant files, however, are from different samples. I just want to make a multi-sample VCF so that I can compare the variants of each strain. The VCF file generated by CombineVariants is:

Mutltisample.vcf. I'm not sure whether it is giving the proper results.

And how do I interpret this union.vcf? For example, the GT column shows ./. for some samples; what does that mean? Also, this is a haploid genome, so GT should contain a single value, but some columns show 1|0, 0|1...

Can anyone help?

Thank you
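
For readers with the same question about the GT column: in the VCF format, "." is the missing-value marker, "/" separates unphased alleles and "|" separates phased alleles. So ./. means no genotype was recorded for that sample at that site. In a file merged from per-sample VCFs this usually means the sample had no record there, which on its own does not distinguish "same as reference" from "no coverage". Two-allele values such as 1|0 suggest the calls were made with a ploidy of 2, which is HaplotypeCaller's default; this is an assumption, since the calling command is not shown in the post. Below is a minimal Python sketch for decoding a GT string; the function name describe_gt is made up for illustration.

```python
def describe_gt(gt):
    """Describe one GT value taken from a VCF sample column."""
    # "|" marks a phased genotype, "/" an unphased one.
    phased = "|" in gt
    alleles = gt.replace("|", "/").split("/")
    return {
        # Number of alleles written: 1 for a haploid call, 2 for a diploid call.
        "ploidy": len(alleles),
        "phased": phased,
        # "." is the VCF missing-value marker: no call for this sample.
        "missing": all(a == "." for a in alleles),
        # 0 = reference allele, 1 = first ALT allele, and so on.
        "alleles": [None if a == "." else int(a) for a in alleles],
    }


if __name__ == "__main__":
    for gt in ["./.", "1|0", "0|1", "1/1", "1"]:
        print(gt, describe_gt(gt))
```

If haploid genotypes are wanted, the usual route is to redo the calling with the ploidy set to 1 rather than to edit the merged file afterwards.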

variant calling gatk vcf • 930 views
10 months ago by
Dublin / Ireland
Raony Guimarães1.1k wrote:

You could merge the files with bcftools:

bcftools merge -O z -o merged.vcf.gz sample1.vcf.gz sample2.vcf.gz


command not working!

Reply written 10 months ago by S AR50
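
The reply does not say what error appeared. One frequent cause is that bcftools merge expects every input to be bgzip-compressed and indexed, so plain .vcf files or unindexed .vcf.gz files are rejected. The following is a rough Python sketch, under that assumption, that checks the inputs and builds a merge command reading the input names from a list file, which is more practical than typing 2300 file names. The helper names check_merge_inputs and merge_command and the file names are hypothetical.

```python
from pathlib import Path


def check_merge_inputs(paths):
    """Return a list of likely problems for `bcftools merge` inputs."""
    problems = []
    for p in map(Path, paths):
        if not p.exists():
            problems.append(f"{p}: file not found")
            continue
        # bgzip output is gzip-compatible and starts with the gzip magic bytes.
        # This detects uncompressed files only; it cannot tell plain gzip from bgzip.
        with open(p, "rb") as fh:
            if fh.read(2) != b"\x1f\x8b":
                problems.append(f"{p}: not compressed (try: bgzip {p})")
                continue
        # The index sits next to the file as <name>.tbi or <name>.csi.
        if not any(Path(str(p) + ext).exists() for ext in (".tbi", ".csi")):
            problems.append(f"{p}: no index (try: bcftools index -t {p})")
    return problems


def merge_command(list_file, out="merged.vcf.gz"):
    """Build the argument list for bcftools merge using a file of input names."""
    # -l reads the input file names from list_file, one per line.
    return ["bcftools", "merge", "-l", str(list_file), "-O", "z", "-o", out]


if __name__ == "__main__":
    # Hypothetical input names, matching the command in the answer above.
    for problem in check_merge_inputs(["sample1.vcf.gz", "sample2.vcf.gz"]):
        print(problem)
    print(" ".join(merge_command("vcf_list.txt")))
```

Once the problem list is empty, the returned argument list can be passed to subprocess.run or pasted into a shell.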
10 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum124k wrote:

But GATK says: CombineVariants can be used to combine variant calls that were produced from the same samples but using different methods, for comparison.

Yes, CombineVariants can merge VCFs that contain the same samples (--genotypeMergeOptions PRIORITIZE), but in your case you need to use the option --genotypeMergeOptions REQUIRE_UNIQUE:

Require that all samples/genotypes be unique between all inputs.
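
Because REQUIRE_UNIQUE rejects inputs that share a sample name, it can save time to check the names before launching a merge of 2300 files. Per-sample VCFs can end up with identical names if the same sample label was used for every run; whether that applies here is not stated in the thread. A minimal Python sketch of such a check follows; the function names are illustrative and not part of GATK.

```python
import gzip
from collections import Counter


def sample_names(vcf_path):
    """Return the sample names from the #CHROM header line of a VCF."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as fh:
        for line in fh:
            if line.startswith("#CHROM"):
                # The first 9 columns are fixed (CHROM to FORMAT);
                # sample names start at column 10.
                return line.rstrip("\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # reached the records without finding a header line
    return []


def duplicated_samples(vcf_paths):
    """Return sample names that occur more than once across the inputs."""
    counts = Counter(name for p in vcf_paths for name in sample_names(p))
    return sorted(name for name, n in counts.items() if n > 1)
```

An empty list from duplicated_samples means the sample names are unique across all inputs, which is what REQUIRE_UNIQUE demands.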


Hey Pierre, do you have any advice on using GenotypeGVCFs for multiple samples from viruses that are similar but were aligned to different reference strains? That is, slightly different reference strains were used in order to improve read alignment, but I would like to compare the samples afterwards even though the references differ slightly. Thank you.

Reply written 4 months ago by gclemd0