Question: GATK multi sample variant calling
14 months ago, cetin.m wrote:

I am trying to call SNPs from two BAM files simultaneously. I want the calls written to a single VCF, with one column per sample.

I run the following command:

./GenomeAnalysisTK.jar -T UnifiedGenotyper -I sampleA.bam -I sampleB.bam -R /mnt/NEOGENE1/share/ref/genomes/hsa/hs37d5.fa -L /mnt/NAS/projects/2018_MCetin_Selection/imputation/1000G_chr22.bed --output_mode EMIT_ALL_SITES --genotyping_mode GENOTYPE_GIVEN_ALLELES --alleles /mnt/NEOGENE1/share/dna/hsa/genotypes/1000G/ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf -o output.vcf

The program runs without error, but the output file contains only one sample column, named sample1. (It possibly corresponds to sampleB.bam; sampleA.bam can be called without problems on its own.)

What am I doing wrong?

Thank you for reading!

Tags: snp, gatk, variant calling
14 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

UnifiedGenotyper is deprecated; use HaplotypeCaller instead.

"it only contains one sample column, named sample1."

That is because both BAMs carry the same sample name in their read groups: the SM attribute on the @RG header lines is 'sample1' in both files, so GATK treats the reads as coming from a single sample. See https://gatkforums.broadinstitute.org/gatk/discussion/6472/read-groups
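
To confirm this before changing anything, it helps to look at the sample names stored in each BAM header. The following is a minimal sketch, assuming samtools is installed; the helper name rg_samples is made up for illustration. It prints the distinct SM values found on the @RG header lines.

```shell
# Print the distinct SM (sample name) values found on the @RG lines
# of a SAM header read from stdin. (rg_samples is a hypothetical helper.)
rg_samples() {
    awk -F '\t' '$1 == "@RG" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SM:/) print substr($i, 4)
    }' | sort -u
}

# Usage (assumes samtools is on the PATH):
#   samtools view -H sampleA.bam | rg_samples
#   samtools view -H sampleB.bam | rg_samples
```

If both commands print the same name, the two BAMs will be merged into one sample column in the VCF.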

One way to fix this is to give each BAM its own sample name with Picard AddOrReplaceReadGroups: https://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups
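
A rough sketch of the renaming step, under a few assumptions: the jar location (PICARD_JAR), the wrapper name rename_sample, the output file names, and the read-group values other than RGSM are all placeholders, and RGPL=ILLUMINA is a guess about the sequencing platform. Note that AddOrReplaceReadGroups assigns all reads in the file to a single new read group.

```shell
# Hypothetical location of the Picard jar; adjust to your installation.
PICARD_JAR=${PICARD_JAR:-picard.jar}

# rename_sample IN.bam OUT.bam SAMPLE
# Writes OUT.bam with a single read group whose SM tag is SAMPLE.
# Picard also expects library (RGLB), platform (RGPL) and platform unit
# (RGPU) values; the ones used here are placeholders.
rename_sample() {
    java -jar "$PICARD_JAR" AddOrReplaceReadGroups \
        I="$1" O="$2" \
        RGID="$3" RGLB="$3" RGPL=ILLUMINA RGPU="$3" RGSM="$3" \
        CREATE_INDEX=true
}

# Usage:
#   rename_sample sampleA.bam sampleA.rg.bam sampleA
#   rename_sample sampleB.bam sampleB.rg.bam sampleB
```

Passing the two renamed BAMs to the caller with -I should then produce one column per sample.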

Makes a lot of sense!

Reply written 14 months ago by cetin.m