I have VCF files with genomic variants (e.g. SNPs and indels) from several patients. I now want to merge them into one VCF file, e.g. using vcftools' vcf-merge. The result holds the union of all variants observed among the merged files, with the data for each individual in its own column:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ind1 Ind2 Ind3
chr1 583 . G A 8.44 . <..> <..> 1/1 1/0 ./.
However, if no variant was called for an individual at a position (because it agreed with the reference), that individual has no information for this position and a "./." is written instead. Instead of "./.", I want the actual genotype for that specific variant to be printed:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ind1 Ind2 Ind3
chr1 583 . G A 8.44 . <..> <..> 1/1 1/0 0/1
So all I need to do is go back to the BAM file and look up the genotype of this SNP or indel.
Is anyone aware of a tool that fills in the missing genotypes by looking them up in the alignment file? (GATK, for example, does this for SNPs but not for indels.)
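To make the lookup step concrete, here is a minimal sketch in Python (standard library only) that lists which site/sample pairs in the merged file still have a missing genotype, i.e. the positions that would have to be re-checked in each individual's BAM. The function name `missing_genotypes` is made up for illustration. The sketch assumes a tab-delimited VCF with GT as the first FORMAT field, and the INFO and FORMAT values in the example are placeholders for the elided columns above. It does not do the genotyping itself.

```python
def missing_genotypes(vcf_lines):
    """Yield (chrom, pos, ref, alt, sample) for every missing genotype.

    Assumes tab-delimited VCF lines with GT as the first FORMAT field.
    """
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        for sample, call in zip(samples, fields[9:]):
            genotype = call.split(":")[0]  # GT is the first sub-field
            if genotype in ("./.", ".|.", "."):
                yield chrom, int(pos), ref, alt, sample


# Example with the record from above (INFO "." and FORMAT "GT" are placeholders)
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tInd1\tInd2\tInd3",
    "chr1\t583\t.\tG\tA\t8.44\t.\t.\tGT\t1/1\t1/0\t./.",
]
print(list(missing_genotypes(example)))
# prints [('chr1', 583, 'G', 'A', 'Ind3')]
```

Each tuple it yields is one position that would need to be genotyped from the corresponding individual's alignment.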