Question: Issue with ExAC and 1000g hg38 lifted-over data and systematic failure of annotation software!
3.3 years ago by
New York, USA
reza.jabal370 wrote:

ExAC data with hg38 coordinates has been available for filter-based annotation since late 2015, but there seems to be a systematic problem with using lifted-over ExAC and 1000G data for annotation! Mainstream annotation tools (Annovar, VEP and snpEff) fail to report the MAF for variants located on contigs whose orientation is reversed in the hg38 assembly. As a result, variants that are common in ExAC and 1000G populations might be misinterpreted as novel simply because the annotation software does not report the corresponding MAF.

I was wondering if anyone here has come across the same problem and if so, how they have tackled this problem?

exac hg38 annotation 1000g • 1.7k views
3.1 years ago by
Pablo1.9k wrote:

In my opinion, this sounds like a problem in the lift-over procedure, as opposed to a failure in the annotation software.

To lift over variants correctly (e.g. a VCF file from ExAC), changing the coordinates is not enough: in regions whose orientation is reversed between assemblies, the variant's REF and ALT fields must also be complemented accordingly (I'm talking about the Watson-Crick complement). If this last part is not done right, downstream annotation software will fail to annotate simply because its input is incorrect.

Again, this is an opinion / guess (since there was no sample data in the post, I cannot dig deeper).
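The allele adjustment described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Record`, `lift_alleles` and the `strand_flipped` flag are hypothetical names, the lifted coordinates are assumed to have been computed elsewhere (e.g. from a chain file), and indels would additionally need their anchor base re-derived, which is not handled here.

```python
# Sketch only: Record and lift_alleles are hypothetical, not part of any tool.
from typing import NamedTuple, Tuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


class Record(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: Tuple[str, ...]


def lift_alleles(rec: Record, new_chrom: str, new_pos: int,
                 strand_flipped: bool) -> Record:
    """Apply already-lifted coordinates and fix alleles on a strand flip.

    Assumption: new_chrom/new_pos come from a separate coordinate lift-over
    step, and strand_flipped says whether the target block is reversed.
    """
    if not strand_flipped:
        return Record(new_chrom, new_pos, rec.ref, rec.alt)
    return Record(
        new_chrom,
        new_pos,
        reverse_complement(rec.ref),
        tuple(reverse_complement(a) for a in rec.alt),
    )
```

For a SNV such as A>G in a reversed block, the lifted record should read T>C; a record that still reads A>G after lift-over will not match what a caller reports against hg38.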


Hi Pablo, thanks for your comment! Yes, you guessed right. Deeper investigation led me to realise that the problem actually lies in the lifted-over dbSNP data. I am now using a custom script to fill in the missing frequencies.
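The script itself was not posted, so the following is not it. It is only a minimal sketch of one way such a fallback could work, assuming the frequencies are held in a dictionary keyed by `(chrom, pos, ref, alt)`. The name `lookup_maf` and the dictionary layout are hypothetical. The fallback is restricted to SNVs, and strand-ambiguous pairs (A/T, C/G) would still need manual review.

```python
# Sketch only: lookup_maf and the freqs layout are assumptions for illustration.
from typing import Dict, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

Key = Tuple[str, int, str, str]


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def lookup_maf(freqs: Dict[Key, float], chrom: str, pos: int,
               ref: str, alt: str) -> Tuple[Optional[float], bool]:
    """Return (maf, used_flipped_alleles); maf is None when nothing matches."""
    key = (chrom, pos, ref, alt)
    if key in freqs:
        return freqs[key], False
    # Fallback for SNVs only: try the alleles as they would appear if the
    # database record kept its old-strand alleles after lift-over.
    if len(ref) == 1 and len(alt) == 1:
        flipped = (chrom, pos, reverse_complement(ref), reverse_complement(alt))
        if flipped in freqs:
            return freqs[flipped], True
    return None, False
```

Returning the flag alongside the frequency keeps track of which values came from the strand-flipped fallback, so they can be checked separately.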

written 3.1 years ago by reza.jabal370