Question: What is SIFT-score?
5.7 years ago by
Korea, Republic Of
mangfu100730 wrote:


I am wondering about the term "SIFT score".

I think SIFT refers to some measurement of SNPs. While reading the ANNOVAR paper, I came across the following sentence:

Finally, Annovar can filter specific variants such as SNPs with >1% frequency in the 1000 Genomes Projects, or non-synonymous SNPs with SIFT scores > 0.05.

Regarding this sentence, I have two questions.

1) I think 1% is a rather low allele frequency. Does this cutoff really help filter out irrelevant SNP variants? I don't think so.

2) The SIFT score threshold in the sentence above is 0.05. What does SIFT mean, and how does a threshold of 0.05 affect variant filtering?


5.7 years ago by
John12k wrote:

SIFT and PolyPhen are the two most commonly used algorithms for predicting whether a SNP has a (generally negative) effect on protein structure. Because the genetic code is redundant, many SNPs never translate into any change in the protein. This happens far more often than you would expect by chance, because variants that affect the protein sequence are usually under negative selection pressure. SIFT/PolyPhen can therefore be used to weed out a lot of irrelevant entries from a very large list of candidate variants.

If I'm not mistaken (and it's been a long time since I used either, so treat the details with caution), SIFT bases its prediction on sequence conservation: it aligns homologous proteins and asks how well each position tolerates a given amino acid substitution. PolyPhen also uses conservation, but adds structural and physicochemical features of the protein. Both score amino acid substitutions, so there is often a lot of overlap in what they flag.

Again, it's been a long time since I used either, so I may have some of that wrong. But I can tell you this: I spent 3 years studying single-base-pair exon variants in consanguineous families with a known phenotype, and only very rarely did SIFT or PolyPhen pick the correct variant from a list. I wouldn't say they are junk, they're not, but variants that cause transcription factor non-specificity, splicing changes, RNA Pol destabilisation, etc., are completely ignored. Do not rely on SIFT and PolyPhen for anything other than ordering a candidate list for follow-up analysis :)
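
As a rough illustration of "ordering a candidate list", here is a minimal Python sketch. It assumes the conventional SIFT reading, where scores run from 0 to 1 and a score of 0.05 or below is called deleterious. The `Variant` type, the variant names, and the scores are all made up for the example; they are not output from SIFT or any real tool.

```python
from typing import NamedTuple, Optional

# Conventional SIFT cutoff: scores <= 0.05 are called deleterious,
# scores > 0.05 are called tolerated.
SIFT_DELETERIOUS_MAX = 0.05


class Variant(NamedTuple):
    name: str               # hypothetical identifier, for illustration only
    sift: Optional[float]   # None when there is no SIFT prediction


def sift_label(score: Optional[float]) -> str:
    """Translate a SIFT score into the usual qualitative call."""
    if score is None:
        return "not scored"
    return "deleterious" if score <= SIFT_DELETERIOUS_MAX else "tolerated"


def rank_candidates(variants):
    """Order variants from lowest (most damaging) SIFT score to highest.

    Unscored variants are kept and placed last rather than dropped,
    since SIFT says nothing about splicing or regulatory effects.
    """
    return sorted(
        variants,
        key=lambda v: (v.sift is None, v.sift if v.sift is not None else 0.0),
    )


# Made-up scores, purely for demonstration.
candidates = [
    Variant("var_A", 0.40),
    Variant("var_B", 0.01),
    Variant("var_C", None),
    Variant("var_D", 0.05),
]

ranked = rank_candidates(candidates)
for v in ranked:
    print(v.name, v.sift, sift_label(v.sift))
```

Keeping the unscored variants at the end of the list, instead of discarding them, reflects the point above: the ranking is only a way to prioritise follow-up, not a filter you can trust on its own.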


Hi John,

Would you use VEP, as well as looking at the genomic location (e.g. splice sites), for prediction of the functional effects that are overlooked by Polyphen and SIFT?

Many thanks,


written 4.8 years ago by morgen10

Yes, I would. It runs PolyPhen/SIFT on your input and, as you rightly say, gives other useful hints as well. These days, I'd point everyone to VEP :)

written 4.7 years ago by John12k
5.7 years ago by
Emily_Ensembl21k wrote:

To answer your first question, 1% is the standard cutoff used to distinguish "common" from "rare" variants. Depending on your study, you might want to change that. For example, in a GWAS for a common trait, you might be interested only in variants above a certain frequency in the population, whereas if you're looking at rare Mendelian traits you might want only very low frequency variants. You may also want to narrow this down to a specific population, e.g. for a GWAS in African Americans, you would be interested in variants common in African populations. Steve mentioned the VEP, which allows you to filter variants by frequency: you choose your own frequency threshold, the direction (> or <), and a population of interest.
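
To make the idea concrete, here is a small Python sketch of that kind of frequency filter. It is not VEP's implementation or interface; the function name, the dictionary layout, the population codes, and the frequencies are all invented for illustration. It also assumes that a variant with no recorded frequency in a population is treated as unobserved (frequency 0), which you may or may not want in a real analysis.

```python
import operator


def filter_by_frequency(variants, population, threshold=0.01, keep="rare"):
    """Keep variants that are rare (< threshold) or common (>= threshold)
    in the chosen population.

    Assumption: a missing frequency is treated as 0.0 (never observed).
    """
    if keep == "rare":
        compare = operator.lt
    elif keep == "common":
        compare = operator.ge
    else:
        raise ValueError("keep must be 'rare' or 'common'")
    return [
        v for v in variants
        if compare(v["freq"].get(population, 0.0), threshold)
    ]


# Made-up allele frequencies per population, for demonstration only.
variants = [
    {"id": "var_1", "freq": {"AFR": 0.20, "EUR": 0.002}},
    {"id": "var_2", "freq": {"AFR": 0.004, "EUR": 0.03}},
    {"id": "var_3", "freq": {}},
]

# GWAS-style question: which variants are common in the AFR population?
common_afr = filter_by_frequency(variants, "AFR", threshold=0.01, keep="common")

# Rare-disease-style question: which variants are rare in the EUR population?
rare_eur = filter_by_frequency(variants, "EUR", threshold=0.01, keep="rare")

print([v["id"] for v in common_afr])
print([v["id"] for v in rare_eur])
```

The same variant can land on different sides of the cutoff depending on the population, which is why the choice of population matters as much as the 1% threshold itself.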

5.7 years ago by
Steve Lianoglou5.0k wrote:

The first line from the SIFT website says:

  SIFT predicts whether an amino acid substitution affects protein function.

It is a method to help curate "variants of interest" in coding regions. 

Other such tools include:

Googling around will uncover others ...

4.7 years ago by
United States/Baltimore/Johns Hopkins University
r0ntu50 wrote:

If you're looking for the effect of SNPs on protein function, you can probably use CRAVAT. It provides a VEST pathogenicity score that quantifies a variant's predicted functional impact. Refer to this post for more details about the tool.
