Question: VT normalize error with inconsistent fasta file for clinvar20181028 GRCh37 build
2.3 years ago, vnttung.iseartclub wrote:

I keep running into trouble while trying to normalize a ClinVar VCF file (hg19 build) with the vt program. I have tried every recent hg19/v37 reference FASTA, but I still get the same error:

[variant_manip.cpp:96 is_not_ref_consistent] reference bases not consistent: Y:555381-555381 A(REF) vs N(FASTA)
[normalize.cpp:209 normalize] Normalization not performed due to inconsistent reference sequences. (use -n or -m option to relax this)

Do you know where to find the reference file ClinVar used to build their latest GRCh37 VCF, or any other way to solve this problem? Much appreciated!

2.3 years ago, finswimmer wrote:

Hello vnttung.iseartclub,

This position is located in the pseudoautosomal region (PAR). On the Y chromosome, this region is usually masked with N in the reference files used for alignment, which is why vt sees N in the FASTA where the VCF has A as REF. The reasons for the masking are described in more detail in this tutorial.
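
To confirm that the FASTA really has a masked base at the reported position, you can look it up directly. The following is only a minimal sketch in plain Python; the function name `fetch_base` and the tiny in-memory FASTA are hypothetical and just for illustration. For a real genome-sized file an indexed lookup with a dedicated tool would be far faster.

```python
import io


def fetch_base(fasta_handle, chrom, pos):
    """Return the base at 1-based position `pos` of sequence `chrom`.

    Reads the FASTA stream line by line, so it needs no index.
    Returns None if the sequence or the position is not found.
    """
    current = None
    offset = 0  # bases already consumed on the current sequence
    for line in fasta_handle:
        line = line.rstrip()
        if line.startswith(">"):
            if current == chrom:
                return None  # target sequence ended before reaching pos
            current = line[1:].split()[0]  # name is the first word of the header
            offset = 0
            continue
        if current != chrom:
            continue
        if offset + len(line) >= pos:
            return line[pos - offset - 1]
        offset += len(line)
    return None


# Hypothetical toy FASTA: the first six bases of "Y" are masked.
toy_fasta = ">Y toy example\nNNNN\nNNAC\n>X\nACGT\n"
print(fetch_base(io.StringIO(toy_fasta), "Y", 3))
print(fetch_base(io.StringIO(toy_fasta), "Y", 7))
```

If the lookup at the position from the error message (Y:555381) returns N for your reference, the file is masked there and the message from vt is expected.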

I wonder how ClinVar can be sure that this variant is located on Y and not on X. Nevertheless, you have two options:

  1. Ignore variants that are located in the PAR region of the Y chromosome for normalization
  2. Find a reference sequence where this region isn't masked. One way is described in "Which human reference genome should I use?"
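
For option 1, one possible approach is to set the Y-PAR records aside before running vt and normalize only the rest. The sketch below assumes the commonly cited GRCh37 PAR coordinates on Y (PAR1 10,001-2,649,520 and PAR2 59,034,050-59,363,566); please verify them against your assembly. The names `Y_PAR_GRCH37`, `in_y_par` and `split_vcf_lines` are made up for this example, and the VCF records in the demo are synthetic.

```python
# Assumed GRCh37 pseudoautosomal regions on chrY (1-based, inclusive).
Y_PAR_GRCH37 = [(10001, 2649520), (59034050, 59363566)]


def in_y_par(chrom, pos):
    """True if chrom:pos lies in a Y pseudoautosomal region (GRCh37)."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    return name == "Y" and any(start <= pos <= end for start, end in Y_PAR_GRCH37)


def split_vcf_lines(lines):
    """Split VCF lines into (kept, skipped).

    Header lines and records outside the Y PAR go to `kept`;
    records inside the Y PAR go to `skipped`.
    """
    kept, skipped = [], []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        chrom, pos = line.split("\t", 2)[:2]
        (skipped if in_y_par(chrom, int(pos)) else kept).append(line)
    return kept, skipped


# Synthetic records for illustration only.
demo = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "X\t555381\t.\tA\tG\t.\t.\t.",
    "Y\t555381\t.\tA\tG\t.\t.\t.",
    "Y\t3000000\t.\tC\tT\t.\t.\t.",
]
kept, skipped = split_vcf_lines(demo)
print(len(kept), len(skipped))
```

Write `kept` to a new VCF and pass that file to vt; the `skipped` records can be appended back unnormalized afterwards if you need them.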

fin swimmer

PS: @ Bastien Hervé This time I added the link to wiki again ;)
