I'm working with a VCF (v4.1) file in which the deletions are incorrectly formatted for some reason. The insertions are fine, but the deletions are written like this (example):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
2 32474671 indel.60227 A - . PASS . GT
Notice that the ALT is "-", when the line should have been formatted like this (example):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
2 32474670 indel.60227 GA G . PASS . GT
I have no idea how the deletions ended up like this in the VCF, but my current plan is to look up these positions in a reference genome FASTA file and correct all the deletion records myself, so I don't have to drop them from the VCF. Is there a tool that already does this? As it stands, I'm writing my own parser.
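For reference, here is a minimal sketch of the correction I have in mind. It assumes the record has already been split into its columns and that the reference is available as a plain dict mapping chromosome name to sequence string. The function name `left_pad_deletion` and the `genome` dict are just placeholders for illustration; a real script would read bases from an indexed FASTA instead. The idea is to fetch the base at POS-1, prepend it to REF, use it alone as ALT, and shift POS back by one.

```python
def left_pad_deletion(record, genome):
    """Rewrite a 'REF=A ALT=-' deletion into the padded VCF form.

    record: list of VCF columns (CHROM, POS, ID, REF, ALT, ...)
    genome: dict of chromosome name -> sequence string (hypothetical
            stand-in for a real FASTA lookup)
    """
    chrom, pos, ref, alt = record[0], int(record[1]), record[3], record[4]
    if alt != "-":
        return record  # not one of the malformed deletions; leave as-is
    if pos < 2:
        # No preceding base exists; this case needs separate handling.
        raise ValueError("deletion at the first base of the sequence")
    anchor = genome[chrom][pos - 2].upper()  # base at 1-based position POS-1
    fixed = list(record)
    fixed[1] = str(pos - 1)   # POS now points at the anchor base
    fixed[3] = anchor + ref   # REF = anchor base + deleted bases
    fixed[4] = anchor         # ALT = anchor base only
    return fixed


# Toy example with a made-up 4-base "chromosome", not real coordinates.
toy_genome = {"2": "CCGA"}
toy_record = ["2", "4", "indel.60227", "A", "-", ".", "PASS", ".", "GT"]
print(left_pad_deletion(toy_record, toy_genome))
# ['2', '3', 'indel.60227', 'GA', 'G', '.', 'PASS', '.', 'GT']
```

This sketch does not check that the REF allele in the file actually matches the reference sequence at that position, which would probably be worth adding before trusting the output.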