Question

updated clinical variant annotation software and databases

5.3 years ago

cocchi.e89 ▴ 270

Hi all!

I know it may sound strange, but I would like to ask you all what software you use for clinical annotation of variants.

I know and use ANNOVAR, but I think there are plenty of others, and I would like to get a "global" idea of the available software and databases from this fantastic forum!

Thanks a lot in advance!

clinvar clinical annotation annovar ngs exome • 3.3k views

updated 5.3 years ago by manuel.belmadani ★ 1.3k • written 5.3 years ago by cocchi.e89 ▴ 270

score 1 · Answer 1 · 2019-01-16

My go-to is ANNOVAR, because there are tons of databases available and it's fairly flexible and powerful. Since you're interested in clinical variant annotations (presumably ClinVar), it's worth pointing out that you can create your own up-to-date ClinVar databases for ANNOVAR.

From: http://annovar.openbioinformatics.org/en/latest/user-guide/filter/#-metalr-annotation

Although I periodically update ClinVar database in ANNOVAR for help users perform annotation, due to the frequent update schedule of ClinVar, users are advised to create a database yourself using the prepare_annovar_user.pl tool. An example procedure is given below[...]

See the ClinVar FTP server: ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/

This is important because ClinVar can change within a few weeks, so running a ClinVar database that is a few years old may leave you with incomplete information. There are also other ANNOVAR databases that might be clinically relevant, such as gnomAD/ExAC for population frequencies, or COSMIC for somatic cancer mutations.
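To make the "how old is my ClinVar copy?" check concrete, here is a minimal Python sketch. It assumes your local ClinVar VCF carries a `##fileDate=YYYY-MM-DD` meta line in its header (worth confirming on the file you actually downloaded). The function names and the 30-day threshold are my own illustrative choices, not part of ANNOVAR or ClinVar.

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def clinvar_file_date(header_lines: Iterable[str]) -> Optional[date]:
    """Return the date from a '##fileDate=YYYY-MM-DD' header line, if present."""
    for line in header_lines:
        if not line.startswith("##"):
            break  # past the meta-information lines, stop looking
        if line.startswith("##fileDate="):
            value = line.strip().split("=", 1)[1]
            return date.fromisoformat(value)
    return None


def is_stale(file_date: date, today: date, max_age_days: int = 30) -> bool:
    """True if the file is older than max_age_days (an arbitrary threshold)."""
    return today - file_date > timedelta(days=max_age_days)


# Illustrative header lines, not copied from a real ClinVar release.
example_header = [
    "##fileformat=VCFv4.1\n",
    "##fileDate=2019-01-14\n",
    "##source=ClinVar\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
]

found = clinvar_file_date(example_header)
if found is not None:
    print(found, "stale:", is_stale(found, today=date(2019, 1, 16)))
```

For a real file you would pass the opened (gzip-decompressed) file object instead of `example_header`, and `date.today()` instead of a fixed date.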

Ensembl VEP is also quite popular and, I believe, supports ClinVar, but I haven't used it: https://uswest.ensembl.org/info/docs/tools/vep/vep_formats.html

You may also want to combine your input variants in VCF format with external VCF files (which, incidentally, COSMIC, ClinVar and gnomAD all provide) using an annotation tool such as vcfanno.

[...]we describe vcfanno, which flexibly extracts and summarizes attributes from multiple annotation files and integrates the annotations within the INFO column of the original VCF file.

Publication: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
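To illustrate the general idea of pulling an attribute out of an annotation VCF and writing it into the INFO column of your own VCF, here is a simplified Python sketch. This is not how vcfanno is implemented: it only does exact matching on (CHROM, POS, REF, ALT), ignores multi-allelic records, and does not add the matching `##INFO` header line. The function names and the `clinvar_sig` key are assumptions made up for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

Key = Tuple[str, str, str, str]  # (CHROM, POS, REF, ALT)


def index_annotations(vcf_lines: Iterable[str], info_key: str) -> Dict[Key, str]:
    """Map each variant in an annotation VCF to the value of one INFO key."""
    index: Dict[Key, str] = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, info = fields[0], fields[1], fields[3], fields[4], fields[7]
        for entry in info.split(";"):
            name, sep, value = entry.partition("=")
            if sep and name == info_key:
                index[(chrom, pos, ref, alt)] = value
    return index


def annotate(vcf_lines: Iterable[str], index: Dict[Key, str], new_key: str) -> Iterator[str]:
    """Yield VCF lines, appending new_key=value to INFO for matching variants."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            yield line.rstrip("\n")  # pass header and blank lines through
            continue
        fields = line.rstrip("\n").split("\t")
        key = (fields[0], fields[1], fields[3], fields[4])
        if key in index:
            addition = f"{new_key}={index[key]}"
            # An INFO of "." means "empty", so replace it rather than append.
            fields[7] = addition if fields[7] in (".", "") else f"{fields[7]};{addition}"
        yield "\t".join(fields)


# Made-up records for illustration only.
annotation_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\t.\tA\tG\t.\t.\tCLNSIG=Pathogenic;CLNDN=Example_condition\n",
]
sample_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\t.\tA\tG\t50\tPASS\tDP=20\n",
    "1\t2000\t.\tC\tT\t50\tPASS\t.\n",
]

annotated = list(annotate(sample_vcf, index_annotations(annotation_vcf, "CLNSIG"), "clinvar_sig"))
for record in annotated:
    print(record)
```

In practice a dedicated tool handles overlaps, normalization and headers for you, which is why I'd reach for one before writing something like this.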

Once in a while (when all else fails, or I'm in a hurry), I simply do it manually, using either bash scripting or R to mash non-standard annotations (for example, manually curated phenotypes) into their own column of a tab-delimited version of the VCF file. I would still recommend using a standard tool if possible, but sometimes it's not worth it, or it's just quicker to do it that way.
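For what it's worth, the manual approach can be sketched in a few lines of Python as well as in bash or R. The example below assumes a tab-delimited table with a header row and a gene column to join on. The column names (`GENE`, `PHENOTYPE`), the gene names and the phenotype text are all hypothetical placeholders.

```python
import csv
import io
from typing import Dict


def add_column(tsv_text: str, key_column: str, new_column: str,
               values: Dict[str, str], missing: str = ".") -> str:
    """Append new_column to a tab-delimited table, looked up via key_column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames or []) + [new_column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Rows without a curated value get the placeholder (default ".").
        row[new_column] = values.get(row[key_column], missing)
        writer.writerow(row)
    return out.getvalue()


# Hypothetical table and curated annotations, for illustration only.
table = (
    "CHROM\tPOS\tREF\tALT\tGENE\n"
    "1\t1000\tA\tG\tGENE_A\n"
    "1\t2000\tC\tT\tGENE_B\n"
)
curated_phenotypes = {"GENE_A": "example phenotype"}

result = add_column(table, "GENE", "PHENOTYPE", curated_phenotypes)
print(result, end="")
```

Keying on gene is just one option; the same function works with any column that uniquely identifies what your curated annotations refer to.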