Question

How To Understand Some Annotated Terms Made By Annovar

10.2 years ago

zengtony743 ▴ 80

Hi, I have a question about annotation by Annovar. I find it a little confusing when I need to identify variants that could potentially change the function of a gene or protein.

Annovar produces two tables: exonic_variant_function and variant_function. The first table lists all variants located in exonic coding regions, which could change gene function, but the second table lists all variants located close to an exon or an exon/intron boundary.

In the second table (variant_function), some variants are labeled "exonic", which is explained as "here refers only to coding exonic portion" or "variant overlaps a coding exon". Why are these variants listed in the second table and not in the exonic_variant_function table? And how does "exonic" here differ from "silent mutation"?

Also, "splicing" is explained as a variant that is within 2 bp of an exon/intron boundary by default, and "exonic splicing" is explained as a variant within an exon but close to an exon/intron boundary. Does that mean an "exonic splicing" variant is less likely to alter the transcript than a "splicing" variant? This is not very clear to me.

It is hard to decide whether these "exonic" or "exonic splicing" variants can be ignored without worrying about losing real mutations.

annovar splicing • 5.6k views

updated 10.2 years ago by Ashutosh Pandey 12k • written 10.2 years ago by zengtony743 ▴ 80

score 3 · Answer 1 · 2014-01-21

The "variant_function" file contains all the variants, annotated by the genomic feature they fall in: for example exonic, intronic, UTR, upstream/downstream or intergenic. The "exonic_variant_function" file contains only the exonic variants from the first file, annotated according to their effect: for example synonymous, nonsynonymous or frameshift. It contains all the exonic variants.
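To see how the regions are distributed in your own data, something like the following Python sketch may help. It assumes the usual tab-delimited "variant_function" layout (region annotation in column 1, gene in column 2, then the original input fields); the function name and the sample rows are made up for illustration.

```python
from collections import Counter

def count_regions(lines):
    """Tally the region annotation (column 1) of variant_function rows."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        region = line.split("\t")[0]
        counts[region] += 1
    return counts

# Made-up rows in the assumed layout: region, gene, chr, start, end, ref, alt
sample = [
    "exonic\tGENE1\t1\t1000\t1000\tA\tG",
    "intronic\tGENE1\t1\t1500\t1500\tC\tT",
    "splicing\tGENE2\t2\t2001\t2001\tG\tA",
    "exonic\tGENE3\t3\t3000\t3000\tT\tC",
]
print(count_regions(sample))
```

For a real file, pass an open file handle instead of the sample list, since iterating over a handle also yields lines.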

You may be working with exome sequencing data, so you may not be seeing variants from intergenic regions, and that is why you wrote "but second table lists all variants that are located in region close to exon or exon/intron boundary". With whole genome sequencing, however, the "variant_function" file will be much larger than the "exonic_variant_function" file.

I don't understand what you mean by "what does "exonic" here different from "silent mutation"?". In any case, all the variants in the "exonic_variant_function" file should also be present in the "variant_function" file.
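If you want to check that relationship on your own output, here is a minimal Python sketch. It assumes the usual column layouts: in "variant_function" the input fields (chr, start, end, ref, alt) start at column 3, and in "exonic_variant_function" they start at column 4, after the line ID, the consequence and the transcript annotation. The helper name and the sample rows are hypothetical.

```python
def variant_keys(lines, first_field):
    """Collect (chr, start, end, ref, alt) keys from tab-delimited rows.

    first_field is the 0-based index of the chromosome column.
    """
    keys = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= first_field + 5:
            keys.add(tuple(fields[first_field:first_field + 5]))
    return keys

# Made-up rows in the assumed layouts
variant_function = [
    "exonic\tGENE1\t1\t1000\t1000\tA\tG",
    "intronic\tGENE1\t1\t1500\t1500\tC\tT",
    "exonic\tGENE3\t3\t3000\t3000\tT\tC",
]
exonic_variant_function = [
    "line1\tnonsynonymous SNV\tGENE1:NM_0001:exon2:c.A100G:p.T34A,\t1\t1000\t1000\tA\tG",
    "line3\tsynonymous SNV\tGENE3:NM_0003:exon1:c.T30C:p.D10D,\t3\t3000\t3000\tT\tC",
]

all_keys = variant_keys(variant_function, 2)
exonic_keys = variant_keys(exonic_variant_function, 3)

# Any key listed here would be an exonic variant absent from variant_function
missing = exonic_keys - all_keys
print(missing)
```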

The "splicing" annotation is, according to the author, debatable. He explains it as follows: ""splicing" in ANNOVAR is defined as variant that is within 2-bp away from an exon/intron boundary by default, but the threshold can be changed by the --splicing_threshold argument. Before Feb 2013, if "exonic,splicing" is shown, it means that this is a variant within exon but close to exon/intron boundary; this behavior is due to historical reason, when a user requested that exonic variants near splicing sites be annotated with splicing as well. However, I continue to get user emails complaining about this behavior despite my best efforts to put explanation in the ANNOVAR website with details. Therefore, starting from Feb 2013, "splicing" only refers to the 2bp in the intron that is close to an exon, and if you want to have the same behavior as before, add -exonicsplicing argument." (From the Annovar website.)
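To make the rule in that quote concrete, here is a toy Python sketch of the two behaviors for a single internal exon. It is not ANNOVAR's code: the function name, the coordinate convention (1-based, inclusive) and the labels are assumptions for illustration, and it ignores strand, UTRs and first/last exons.

```python
def classify(pos, exon_start, exon_end, threshold=2, exonicsplicing=False):
    """Toy region call for one position relative to one internal exon."""
    if exon_start <= pos <= exon_end:
        # Inside the exon: "near" means within `threshold` bases of either edge
        near_boundary = (pos - exon_start < threshold) or (exon_end - pos < threshold)
        if exonicsplicing and near_boundary:
            return "exonic,splicing"  # pre-Feb-2013 style label from the quote
        return "exonic"
    # Outside the exon: distance into the intron from the nearest exon edge
    dist = exon_start - pos if pos < exon_start else pos - exon_end
    if dist <= threshold:
        return "splicing"
    return "intronic"

# Hypothetical exon spanning positions 100..200
for pos in (97, 98, 99, 100, 101, 102):
    print(pos, classify(pos, 100, 200), classify(pos, 100, 200, exonicsplicing=True))
```

The point of the sketch is that the extra option only changes the label of variants that are already exonic; they still appear as exonic variants either way.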

I would recommend trying it both ways: with the default splicing annotation and with the -exonicsplicing argument.