break_blocks & GATK switch REF & ALT alleles
21 months ago
dec986 ▴ 300

I'm extracting positions from a gVCF with break_blocks, restricted to the regions in a BED file:

break_blocks --region-file $bed --ref human_g1k_v37.fasta --exclude-off-target

and then GATK's SelectVariants:

gatk SelectVariants -R human_g1k_v37.fasta -V $vcf -O $out_vcf --remove-unused-alternates

to produce the following output:

1 1273477 1273478 seq-rs797044834 . GGGGGCAGCCGGGT G PASS . GT:GQ 0/0:0.29906204

and a VCF generated from break_blocks & GATK with lines like

1 1273478 . G . . . AN=2;DP=40 GT:DP:MIN_DP 0/0:40:35

The lack of ID and ALT is obviously not acceptable.

However, the bigger problem is the REF allele, which seems to have switched places with ALT. This will throw off the allele indices in the GT field.
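
To illustrate why the allele order matters: in a VCF, the integers in GT index into the allele list, where 0 is REF and 1 is the first ALT. Below is a minimal sketch of that lookup. The helper name gt_alleles is hypothetical (it is not part of break_blocks or GATK), and it only handles simple genotypes separated by / or |:

```shell
# gt_alleles REF ALT GT
# Hypothetical helper: translate GT indices into allele strings.
# Index 0 is REF; index N (N >= 1) is the Nth comma-separated ALT allele.
gt_alleles() {
    awk -v ref="$1" -v alt="$2" -v gt="$3" 'BEGIN {
        split(alt, alts, ",")            # ALT alleles, 1-based
        m = split(gt, idx, "[/|]")       # GT indices, split on / or |
        out = ""
        for (i = 1; i <= m; i++) {
            if (idx[i] == ".")    a = "."            # missing call
            else if (idx[i] == 0) a = ref            # 0 means REF
            else                  a = alts[idx[i]]   # N means Nth ALT
            out = out (i > 1 ? "/" : "") a
        }
        print out
    }'
}

# Same GT, alleles in the expected order vs. swapped
gt_alleles GGGGGCAGCCGGGT G 0/0
gt_alleles G GGGGGCAGCCGGGT 0/0
```

With the alleles in the expected order, 0/0 resolves to two copies of the long allele; with REF and ALT swapped, the same 0/0 resolves to G/G.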

Why did the REF switch places with the ALT? How can I fix it?

