Question: Is SNPeff still the standard for variant effect prediction?
4
3.5 years ago by
Lauren70
Lauren70 wrote:

I'm kind of new to this space. A friend of mine says he uses SnpEff for all his exome annotations, and he doesn't know of any other popular tools for this purpose.

I'm annotating some human exomes and I am curious about what else is out there. A search gave me a lot of answers, but I don't know which are popular in the community. Are there gaps that SnpEff leaves that other effect predictors fill? Thank you so much for reading my post!

variant snpeff annotation exome • 3.2k views
modified 20 months ago by Shicheng Guo8.5k • written 3.5 years ago by Lauren70
2

SnpEff is good; also try VEP.

written 3.5 years ago by Medhat8.8k
8
3.5 years ago by
Samuel Brady320
Samuel Brady320 wrote:

The tools I hear used most frequently are SnpEff, VEP, and Annovar. This paper (Table 1) shows a comparison of the three tools.

SnpEff tends to be robust and I personally use it the most. Notably, SnpEff can annotate structural variants and long indels in addition to traditional smaller variants. I've used ANNOVAR once or twice, and strange bugs crop up here and there; however, its developer maintains it well and offers a lot of documentation. VEP seems quite popular, but I personally have the least experience with this one.

modified 3.5 years ago • written 3.5 years ago by Samuel Brady320
1

I like ClinVar annotations in ANNOVAR, but I believe you can use SNPeff for custom annotations from .bed files as well.

You can also use custom annotations in ANNOVAR, which is what I did for GWAS Catalog associations.

Other than that, I think the answer depends on your question. For example, I wouldn't use protein function predictions alone to identify a variant candidate as damaging (and you would want to check for premature stop codons and other loss-of-function variants). In practice, I would probably use population frequencies (like 1000 Genomes, gnomAD, etc.), but it would really be best if normal controls were matched by experimental protocol and bioinformatics processing.

written 20 months ago by Charles Warden8.0k

Hi Charles,

  1. What's the best way to define or identify all the loss-of-function variants in the human genome?
  2. In your practice, how do you use population frequencies in gnomAD to define damaging variants?

Thanks.

written 20 months ago by Shicheng Guo8.5k
1

I'm not sure I know the "best" way to identify loss-of-function variants. A premature stop codon towards the beginning of the gene is probably valid (unless the gene can and does undergo alternative splicing), but I think there are some recommendations here: https://github.com/konradjk/loftee
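To make the "premature stop towards the beginning of the gene" idea concrete, here is a rough Python sketch. It assumes each variant has already been annotated with Sequence Ontology consequence terms (the kind of terms VEP and SnpEff report). The function names, the set of terms treated as loss-of-function, and the 0.5 position cutoff are all illustrative assumptions on my part, not LOFTEE's actual rules.

```python
from typing import Iterable

# Sequence Ontology terms commonly treated as putative loss-of-function.
# The exact set is an assumption for this sketch; adjust it for your project.
LOF_TERMS = frozenset({
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
})


def is_putative_lof(consequences: Iterable[str]) -> bool:
    """Return True if any consequence term is in the putative LoF set."""
    return bool(LOF_TERMS & set(consequences))


def stop_gained_is_early(cds_position: int, cds_length: int,
                         max_fraction: float = 0.5) -> bool:
    """Return True if the stop falls within the first `max_fraction` of the CDS.

    `max_fraction` is a hypothetical cutoff, not a published recommendation.
    """
    if cds_length <= 0:
        raise ValueError("cds_length must be positive")
    return cds_position / cds_length <= max_fraction
```

This says nothing about alternative splicing, so a variant flagged here would still need a look at which transcripts are actually affected.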

In my opinion, having access to specialized information on previous disease associations is the right solution if you are trying to analyze your own data, but ClinVar is currently the best thing I can think of for that.

In terms of getting ideas from current specialized databases, I think these may be some examples to consider:

https://brcaexchange.org/

https://www.cftr2.org/

I think 0.01 or 0.05 would be common allele frequency cutoffs for defining rare variants. However, I usually check frequencies from multiple sources. If you see very different frequencies with different reference sets, I would guess the pre-processing and/or variant calling could be a factor.
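As a minimal sketch of that kind of frequency filter in Python: it assumes you already have per-variant allele frequencies from each reference set in a dictionary. The source labels, the choice to treat a missing frequency as 0, and the 10-fold disagreement cutoff are assumptions for illustration only.

```python
from typing import Dict, Optional

Freqs = Dict[str, Optional[float]]  # source label -> allele frequency (None if absent)


def max_population_af(freqs: Freqs) -> float:
    """Highest allele frequency across sources; a missing value counts as 0 (assumption)."""
    return max((f for f in freqs.values() if f is not None), default=0.0)


def is_rare(freqs: Freqs, threshold: float = 0.01) -> bool:
    """A variant is called rare if it is below the threshold in every source."""
    return max_population_af(freqs) < threshold


def sources_disagree(freqs: Freqs, fold: float = 10.0, floor: float = 1e-4) -> bool:
    """Flag variants whose frequencies differ by at least `fold` between sources.

    `floor` avoids dividing by zero for very small frequencies; both values
    are hypothetical defaults.
    """
    observed = [max(f, floor) for f in freqs.values() if f is not None]
    if len(observed) < 2:
        return False
    return max(observed) / min(observed) >= fold
```

For example, `is_rare({"gnomAD": 0.03, "1000G": None})` is False at the default 0.01 cutoff but True with `threshold=0.05`. Variants flagged by `sources_disagree` are the ones where I would look at pre-processing or variant calling differences first.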

In general, for discovery, I think you will probably see a higher frequency of false positives. As long as you have a way to identify a few possibly important mutations per sample, visualization of the alignment is very important.

Otherwise, if you have your own set of cases and controls (collected and processed the same way), you can test for differences for all variants and then check enrichment of variant categories (like loss-of-function). I've also seen people summarize gene counts (for a particular method of annotating variants) and then compare frequencies at the gene level (between their own cases and controls). In that situation, if you knew the gene involved, you could try various methods to see which results in a high ranking for your gene (for your particular disease / project).
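One way to sketch the gene-level comparison in Python is a per-gene Fisher's exact test on carrier counts. This is only an illustration under stated assumptions: the inputs are hypothetical dictionaries of "number of samples carrying at least one qualifying variant" per gene, the gene names are placeholders, and Fisher's exact test is my choice of test, not something prescribed above.

```python
from math import comb
from typing import Dict, List, Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def rank_genes(case_carriers: Dict[str, int], control_carriers: Dict[str, int],
               n_cases: int, n_controls: int) -> List[Tuple[str, float]]:
    """Rank genes by p-value for a difference in carrier counts between groups."""
    results = {}
    for gene in set(case_carriers) | set(control_carriers):
        a = case_carriers.get(gene, 0)
        c = control_carriers.get(gene, 0)
        results[gene] = fisher_exact_two_sided(a, n_cases - a, c, n_controls - c)
    return sorted(results.items(), key=lambda kv: kv[1])
```

With a known disease gene, you could rerun `rank_genes` on carrier counts produced by different annotation methods and see where that gene lands in each ranking. With many genes, the p-values would also need multiple-testing correction, which this sketch leaves out.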

modified 20 months ago • written 20 months ago by Charles Warden8.0k
1
20 months ago by
Shicheng Guo8.5k
Shicheng Guo8.5k wrote:

The State of Variant Annotation: A Comparison of AnnoVar, snpEff and VEP

http://blog.goldenhelix.com/goldenadmin/the-sate-of-variant-annotation-a-comparison-of-annovar-snpeff-and-vep/

written 20 months ago by Shicheng Guo8.5k