Question: Difference between gene-based, region-based and filter-based annotation in ANNOVAR
vivekruhela wrote, 2.0 years ago:

Hi everyone. I am working on an NGS pipeline and plan to use ANNOVAR to annotate variants. ANNOVAR offers three types of annotation: gene-based, region-based, and filter-based. I have read their definitions, but I am not sure which one to use. I also looked at an example in the ANNOVAR documentation that calls many databases and uses a different annotation scheme for different databases. Why?

Can anybody explain the differences between these schemes, their significance, and how to use them?


Tags: annotation, sequence, next-gen, R, gene
Kevin Blighe wrote, 2.0 years ago:

Ignore the names for now and just decide which annotations you want, and then figure out whether they relate to region-, gene-, or filter-based annotations. You can then annotate with multiple different types concurrently. Kai Wang's documentation on the ANNOVAR website is in fact pretty comprehensive compared to other programs.
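
To make the distinction concrete, here is a minimal sketch of my understanding of two of the three schemes, using toy data invented purely for illustration (the bands, coordinates and frequencies below are not real database records, and this is not how ANNOVAR is implemented internally). Region-based annotation reports database intervals that overlap the variant; filter-based annotation reports a value only when a database record matches the variant exactly on position and alleles. Gene-based annotation, which describes the variant's effect relative to gene models, is not shown because it needs transcript structures.

```shell
#!/usr/bin/env bash
# Toy records, made up for illustration only.
# Region-style database: chr, start, end, label
region_db='chr1 100 200 bandA
chr1 201 300 bandB'
# Filter-style database: chr, start, end, ref, alt, frequency
filter_db='chr1 150 150 A G 0.01
chr1 150 150 A T 0.20'
# Query variant: chr, start, end, ref, alt
variant='chr1 150 150 A G'

# Region-based idea: keep every interval that overlaps the variant.
region_hit=$(printf '%s\n' "$region_db" | awk -v q="$variant" '
  BEGIN { split(q, v, " ") }
  $1 == v[1] && $2 + 0 <= v[3] + 0 && $3 + 0 >= v[2] + 0 { print $4 }')

# Filter-based idea: keep only records identical in position and alleles.
filter_hit=$(printf '%s\n' "$filter_db" | awk -v q="$variant" '
  BEGIN { split(q, v, " ") }
  $1 == v[1] && $2 + 0 == v[2] + 0 && $3 + 0 == v[3] + 0 &&
  $4 == v[4] && $5 == v[5] { print $6 }')

echo "region-based hit: $region_hit"
echo "filter-based hit: $filter_hit"
```

With these toy records the region lookup returns bandA and the filter lookup returns 0.01; the A>T record at the same position is ignored because its alternate allele differs.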

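Here is code that I re-use a lot, including the code that I use to download the databases. The type of annotation is specified with the -protocol and -operation parameters: each database named in -protocol is paired, in order, with the code at the same position in -operation (g for gene-based, r for region-based, f for filter-based).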

#Build the annovar databases
#RefSeq genes
perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar refGene /Programs/annovar/humandb/

#Cytobanding info
perl /Programs/annovar/ -buildver hg19 -downdb cytoBand /Programs/annovar/humandb/

#Segmental duplications
perl /Programs/annovar/ -buildver hg19 -downdb genomicSuperDups /Programs/annovar/humandb/

#Allele frequencies
    #NHLBI-ESP variant frequencies
    perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar esp6500siv2_all /Programs/annovar/humandb/

    #1000 Genomes allele frequencies
    perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar 1000g2015aug /Programs/annovar/humandb/

    #Older 1000 Genomes release (disabled)
    #perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar 1000g2014oct /Programs/annovar/humandb/

    #ExAC allele frequencies
    perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar exac03 /Programs/annovar/humandb/

    #Greater Middle East (GME) Variome allele frequencies
    perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar gme /Programs/annovar/humandb/

#dbSNP 138
perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar snp138 /Programs/annovar/humandb/

#dbSNP 147 with allelic splitting and left-normalisation
perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar avsnp147 /Programs/annovar/humandb/

#SIFT, PolyPhen, and other scores
#Older score set (disabled)
#perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar ljb26_all /Programs/annovar/humandb/

perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar dbnsfp30a /Programs/annovar/humandb/

#COSMIC somatic mutations
perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar cosmic70 /Programs/annovar/humandb/

#ClinVar
perl /Programs/annovar/ -buildver hg19 -downdb -webfrom annovar clinvar_20161128 /Programs/annovar/humandb/

#Annotate with RefSeq genes, cytoband, dbSNP147, et cetera
#   Usage: [arguments] <query-file> <database-location>
#   --protocol <string>, comma-delimited string specifying database protocol
#   --operation <string>, comma-delimited string specifying type of operation
#   --outfile <string>, output file name prefix
#   --buildver <string>, genome build version (default: hg18)
#   --remove, remove all temporary files
#   --otherinfo, print out otherinfo (information after fifth column in queryfile)
#   --onetranscript, print out only one transcript for exonic variants (default: all transcripts)
#   --nastring <string>, string to display when a score is not available (default: null)
#   --csvout, generate comma-delimited CSV file (default: tab-delimited txt file)
#The nth entry in -operation applies to the nth database in -protocol
perl /Programs/annovar/ MyVariants.ann /Programs/annovar/humandb/ \
  -buildver hg19 -remove -otherinfo \
  -protocol refGene,cytoBand,gme,esp6500siv2_all,exac03,dbnsfp30a,avsnp147,cosmic70,clinvar_20161128 \
  -operation g,r,f,f,f,f,f,f,f -nastring "NA" -csvout
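
Because the -protocol and -operation lists have to stay the same length and in the same order, it can help to derive one from the other. Below is a minimal sketch, assuming bash; the helper names operation_for and build_operations are hypothetical and are not part of ANNOVAR. The lookup covers only the databases used in the command above, with g, r and f assigned exactly as in that command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return the operation letter for one database name.
#   g = gene-based, r = region-based, f = filter-based
operation_for() {
  case "$1" in
    refGene) echo g ;;
    cytoBand) echo r ;;
    gme|esp6500siv2_all|exac03|dbnsfp30a|avsnp147|cosmic70|clinvar_20161128) echo f ;;
    *) echo "unknown database: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: turn a comma-separated protocol list into the
# matching comma-separated operation list, preserving order.
build_operations() {
  local protocols="$1" ops="" db op
  local IFS=','
  for db in $protocols; do
    op=$(operation_for "$db") || return 1
    ops="${ops:+$ops,}$op"
  done
  echo "$ops"
}

PROTOCOLS="refGene,cytoBand,gme,esp6500siv2_all,exac03,dbnsfp30a,avsnp147,cosmic70,clinvar_20161128"
OPERATIONS=$(build_operations "$PROTOCOLS")
echo "-protocol $PROTOCOLS -operation $OPERATIONS"
```

The printed arguments can then be pasted into the annotation command. An unrecognised database name makes build_operations fail rather than silently guessing a type.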

Thanks, sir. I'll read the documentation again. Thanks for providing a good example. Are the databases you have invoked through ANNOVAR meant for WES data or for something else? And how can I determine the functional prediction for a mutation individually (other than the ANNOVAR way, because I didn't find a separate repository for MetaSVM or MetaLR)? I have the same question about determining significant somatic mutations.

Thanks again.

(Reply written 2.0 years ago by vivekruhela)