Question: VT normalize error with inconsistent fasta file for clinvar20181028 GRCh37 build
I keep running into trouble while trying to normalize a ClinVar VCF file (hg19 build) with the vt program. I have tried every recent hg19/v37 reference FASTA, but I still get the same error:

[variant_manip.cpp:96 is_not_ref_consistent] reference bases not consistent: Y:555381-555381 A(REF) vs N(FASTA)
[normalize.cpp:209 normalize] Normalization not performed due to inconsistent reference sequences. (use -n or -m option to relax this)

Does anyone know where to find the reference file ClinVar used to build their latest GRCh37 VCF, or any other way to solve this problem? Much appreciated!

Hello vnttung.iseartclub,

This position is located in the pseudoautosomal region (PAR). On the Y chromosome, this region is usually masked with N in the reference files used for alignment, which is why vt sees N in the FASTA where the VCF says A. The reasons for the masking are described in more detail in this tutorial.
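To confirm that a given reference really is masked at the reported position, you can look up the base directly. The following is only a minimal sketch: `fetch_base` is a hypothetical helper (not part of vt), it assumes an uncompressed FASTA read line by line, and it assumes the sequence name is the first word of the header line.

```python
def fetch_base(lines, chrom, pos):
    """Return the base at 1-based position `pos` of sequence `chrom`.

    `lines` is any iterable of FASTA lines (for example an open file handle).
    Returns None if the sequence or the position is not found.
    """
    if pos < 1:
        raise ValueError("pos must be 1-based and positive")
    offset = 0          # number of bases of the target sequence seen so far
    in_target = False   # True while we are inside the requested sequence
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if in_target:
                break   # reached the next sequence without finding pos
            fields = line[1:].split()
            in_target = bool(fields) and fields[0] == chrom
            offset = 0
            continue
        if in_target:
            if pos <= offset + len(line):
                return line[pos - offset - 1]
            offset += len(line)
    return None
```

For example, `with open("reference.fa") as fh: print(fetch_base(fh, "Y", 555381))` (the file name is a placeholder) should print `N` for a reference with a masked Y PAR, matching the vt error message. Depending on the reference, the chromosome may be named `chrY` instead of `Y`.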

I wonder how ClinVar can be sure that this variant is located on Y and not on X. Nevertheless, you have two options:

  1. Ignore variants that are located in the PAR region of the Y chromosome for normalization
  2. Find a reference sequence where this region isn't masked. One way is described in "Which human reference genome should I use?"
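For the first option, one possible approach is to split the VCF before running vt, so that the Y PAR records are set aside and only the remaining records are normalized. This is a rough sketch rather than a tested pipeline: the helper names are made up for illustration, and the interval list uses the commonly cited GRCh37 chrY PAR coordinates (PAR1 10,001-2,649,520 and PAR2 59,034,050-59,373,566), which you should verify against your own reference before relying on them.

```python
# Assumed GRCh37 chrY pseudoautosomal intervals (1-based, inclusive).
Y_PAR_GRCH37 = [(10001, 2649520), (59034050, 59373566)]
Y_NAMES = {"Y", "chrY"}


def in_y_par(chrom, pos):
    """True if (chrom, pos) falls inside a Y PAR interval listed above."""
    return chrom in Y_NAMES and any(
        start <= pos <= end for start, end in Y_PAR_GRCH37
    )


def split_vcf_lines(lines):
    """Split VCF lines into (records to normalize, Y PAR records).

    Header lines are copied to both outputs so each stays a valid VCF.
    """
    to_normalize, par_y = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            to_normalize.append(line)
            par_y.append(line)
            continue
        chrom, pos = line.split("\t", 2)[:2]
        target = par_y if in_y_par(chrom, int(pos)) else to_normalize
        target.append(line)
    return to_normalize, par_y
```

You would write the first list to a new VCF, normalize that file with vt, and decide separately what to do with the set-aside records. The position from the error message, Y:555381, falls inside PAR1 under these coordinates.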

fin swimmer

