Question: SIFT or PolyPhen?
5
9.4 years ago by
Dataminer2.7k
Netherlands
Dataminer2.7k wrote:

If one had to select only one server out of SIFT and PolyPhen (both are used to identify functional missense mutations), which one should one go for?

Note: the rule is that you can select only one, not both.

non mutation snp • 24k views
ADD COMMENTlink modified 8 months ago by always_learning1.0k • written 9.4 years ago by Dataminer2.7k
1

Just in case people are still using the old sift through its old host, here is the new and probably faster host - sift-dna.org/

ADD REPLYlink written 9.3 years ago by Prateek1.0k

Where can we download the Pre-built Polyphen Score? http://genetics.bwh.harvard.edu/pph2/dokuwiki/downloads

Does it seem that I have to process these data to generate the final file?

Regards, Najeeb

ADD REPLYlink written 8 months ago by always_learning1.0k

Please do not add questions in existing threads and do not use the answer field for anything except answers.

ADD REPLYlink written 8 months ago by ATpoint36k

I already added this comment to your abuse of the answer field yesterday in a different thread:

Please do not add questions in existing threads and do not use the answer field for anything except answers.

Please stop doing that. You are free to comment on existing threads but questions should be posted in a new thread, showing the necessary effort and providing the necessary details.

ADD REPLYlink modified 8 months ago • written 8 months ago by ATpoint36k
8
9.4 years ago by
Programmer110
Manchester, UK
Programmer110 wrote:

A soon-to-be-published Human Mutation article suggests that PolyPhen is less dependent on the multiple alignment used as input. If you are not able to produce your own alignments for your specific dataset, then PolyPhen could perhaps be preferred for this reason.

On the other hand, if you can produce your own alignments, then SIFT might be preferable, since its web UI lets you specify the alignment, and with the correct alignment its results are at least comparable in accuracy to PolyPhen's.

ADD COMMENTlink written 9.4 years ago by Programmer110

In our experience with these tools, the alignment has a huge effect on prediction performance, even down to the species used in the alignment. That HMA article is very similar to the work we did (unpublished). Interesting, thanks for the link.

ADD REPLYlink written 9.4 years ago by Daniel Swan13k
5
9.4 years ago by
Daniel Swan13k
Aberdeen, UK
Daniel Swan13k wrote:

Why don't you test them both on a dataset that you already know the results for and see which one gives you the closest match to what you expect?

You never know, someone may already have done this and published it.
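To make the "test them on a dataset you already know the results for" idea concrete, here is a minimal sketch of such a comparison. Everything in it is an assumption for illustration: the toy labels are made up, and the calls are taken to be already binarised (True = predicted damaging) using whatever cutoff you choose for each tool. It reports accuracy and the Matthews correlation coefficient, which is less easily fooled than accuracy when damaging and neutral variants are unevenly represented.

```python
from math import sqrt

def confusion(predicted, truth):
    """Count TP, FP, TN, FN for two equal-length lists of booleans (True = damaging)."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Fraction of variants classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy data: known effect of six variants and each tool's binarised call.
truth          = [True, True, True, False, False, False]
sift_calls     = [True, True, False, False, False, True]
polyphen_calls = [True, True, True, True, False, False]

for name, calls in [("SIFT", sift_calls), ("PolyPhen", polyphen_calls)]:
    counts = confusion(calls, truth)
    print(name, "accuracy=%.2f" % accuracy(*counts), "MCC=%.2f" % mcc(*counts))
```

With real data you would replace the three lists with your curated variants and the tools' outputs; the ranking on a toy set like this says nothing about the tools themselves.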

People will use either SIFT or PolyPhen2, or both. If you use both, you will get a certain subset of predictions that overlap, and each tool will give some predictions unique to its algorithm.
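The overlap and the tool-specific subsets are easy to pull apart once each tool's damaging calls are in a set. A small sketch, assuming you have already reduced each output to a set of variant identifiers (the identifiers here are placeholders, not real variants):

```python
def compare_calls(sift_hits, polyphen_hits):
    """Split two sets of 'damaging' variant IDs into shared and tool-specific subsets."""
    return {
        "both": sift_hits & polyphen_hits,           # called damaging by both tools
        "sift_only": sift_hits - polyphen_hits,      # unique to SIFT
        "polyphen_only": polyphen_hits - sift_hits,  # unique to PolyPhen
    }

# Hypothetical variant IDs, for illustration only.
sift_hits = {"var1", "var2", "var3"}
polyphen_hits = {"var2", "var3", "var4"}

result = compare_calls(sift_hits, polyphen_hits)
for label, variants in result.items():
    print(label, sorted(variants))
```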

ADD COMMENTlink written 9.4 years ago by Daniel Swan13k
3

We have unpublished results along similar lines to the paper below as well. Interestingly, these did suggest PolyPhen-2 was "worse" than PolyPhen-1, due to a wider variance in accuracy depending on the gene involved. This was seemingly due to "poor" alignments being generated for some genes in our datasets. It is quite hard to put together a fair benchmark, but our study suggested that PolyPhen-2's best-case accuracy was better than PolyPhen-1's, while for our data its average-case accuracy and variance were worse. This was using default settings in the web UI.

ADD REPLYlink written 9.4 years ago by Programmer110
1

Would you care to qualify that with some evidence? "feeling" something is better is not enough ;)

ADD REPLYlink written 9.4 years ago by Daniel Swan13k

Well, SIFT and PolyPhen are complementary in approach, which is why they are used together. And getting an overlap of results from both servers is not always helpful. Somehow I feel PolyPhen-1 was better than PolyPhen-2.

ADD REPLYlink written 9.4 years ago by Dataminer2.7k
5
9.4 years ago by
Jarretinha3.3k
São Paulo, Brazil
Jarretinha3.3k wrote:

Honestly, both are misleading. Unless you must annotate SNPs with them, try another approach. Check out this article. Of course, this article is mostly about clinical applications. Nevertheless, I'm currently researching some (proto)oncogenes and I can say for sure that relying on alignments to look for phenotypic effects is too risky. This kind of approach totally ignores correlations among positions. They are quite common, as you can see in this conservative analysis.

My advice is: just search the literature about the regions that you're studying. Most high-quality SNP data with phenotypic annotation isn't present in the popular public databases. E.g. the protein menin has about 150 variants annotated in UniProt, whereas you can find about 1000 of them (true protein-level evidence) in the specialized literature.

SIFT and PolyPhen2 can say opposite things about the same SNP. How do you decide which is right? So, work a bit more to suffer a lot less!!!

-- Edit --

There are several other drawbacks in both approaches. They don't correct for paralogy or redundancy. This can give certain alignment positions much higher weight than they should have. But we know that using alignment as a proxy for purifying selection only works well for low-redundancy sets of distant species, as pointed out here. The success rate of SIFT/PolyPhen2 is mainly due to evident constraints in protein structure. You don't need them to assess that. A secondary structure prediction program with a good profile-guided alignment should return very similar results in a much more transparent way.

ADD COMMENTlink modified 9.3 years ago • written 9.4 years ago by Jarretinha3.3k
2

+1 "SIFT and PolyPhen2 can say opposite things about the same SNP". Yes Jarretinha, you are extremely right: since they are complementary approaches, we can't say which one is right or wrong. I have seen several such examples.

ADD REPLYlink written 9.4 years ago by Khader Shameer18k
1

Shameer, if you had to choose one of them, which one would you choose and why?

ADD REPLYlink written 9.4 years ago by Dataminer2.7k

Can I ask why it has to be SIFT or PolyPhen? It seems very restrictive and a bit unrealistic. We could perhaps help more if we knew the constraints you were operating under. However, you might be interested to know that recent studies have shown that there is no link between the effect of a SNP on protein stability and the deleteriousness of a SNP. I.e. just because PolyPhen classes a non-synonymous amino acid substitution as damaging, there is no increased likelihood that the SNP will be deleterious. BMC Bioinformatics 2009(10) S9

ADD REPLYlink written 9.4 years ago by User 6659970

I'm quite aware of these issues. We have chaperones and related, right? But SNPs can have an effect (e.g. thalassemia, falciform anaemia). It would be very nice to predict it within certain bounds at least. Any better clues?

ADD REPLYlink written 9.4 years ago by Jarretinha3.3k

Sorry, is that question directed at me or at snpminer? If it is directed at me, could you please rephrase it? I don't understand the question. Incidentally, my comment was directed at the OP, not at you. I thought he might be interested in the BMC paper. I'm familiar with you as a member of this forum and I didn't think you were unaware of these issues.

ADD REPLYlink written 9.4 years ago by User 6659970

Also, what are the SNP effects in thalassemia and falciform anaemia you refer to? I'm not aware of the impact of SNPs in these diseases. Are they non-synonymous SNPs? Actually, the thalassemia one is ringing a dim and distant bell now you mention it :)

ADD REPLYlink written 9.4 years ago by User 6659970

Sorry for my Latin! Falciform anaemia = sickle cell disease. Like some thalassemias, it's caused (for sure) by a single non-synonymous substitution in the beta globin genes. It exists as a SNP with relatively high frequency in certain populations. Anyway, my comment still (mainly) targets the question. And your citation suggests three more articles in BMC. Two by Ludwig people from Brazil! Nice!!!

ADD REPLYlink written 9.4 years ago by Jarretinha3.3k
4
9.4 years ago by
Michi950
Barcelona
Michi950 wrote:

Hi

It doesn't have to be either/or. Condel integrates different outputs (like SIFT and PolyPhen2), so you can just run both and integrate them.
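For anyone wondering what integrating the two outputs can look like in the simplest case, here is a rough sketch of a weighted average. This is not Condel's actual scoring: the function name and the equal default weights are placeholders I made up for illustration, whereas Condel derives its own weights. The one tool-specific detail relied on is that SIFT scores run the opposite way to PolyPhen scores (low SIFT = likely deleterious, high PolyPhen = likely damaging), so the SIFT score is flipped before averaging.

```python
def combined_score(sift_score, polyphen_score, w_sift=0.5, w_polyphen=0.5):
    """Weighted average of two scores in [0, 1]; higher means more likely damaging.

    The weights are illustrative placeholders, not Condel's weights.
    """
    if w_sift + w_polyphen <= 0:
        raise ValueError("weights must sum to a positive number")
    sift_damage = 1.0 - sift_score  # flip SIFT so that high = damaging, like PolyPhen
    return (w_sift * sift_damage + w_polyphen * polyphen_score) / (w_sift + w_polyphen)

# Hypothetical scores for two variants.
print(combined_score(0.02, 0.90))  # both tools lean towards damaging
print(combined_score(0.02, 0.10))  # the tools disagree, so the result sits in between
```

A combined score like this does not resolve a disagreement between the tools; it only summarises it as a mid-range value.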

But sticking to your rule of using one server: from Ensembl 62 on, you should be able to directly access an integrated score of PolyPhen2 and SIFT (calculated by Condel) through their API (stated here).

UPDATE:

Now this option is already available at Ensembl: on the web server you can access it directly, or you can query the API yourself.
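If you go the API route, most of the work is pulling the per-transcript scores out of whatever comes back. The sketch below assumes the results have already been fetched and decoded as JSON in the shape returned by Ensembl's VEP REST service, i.e. a list of records, each with a "transcript_consequences" list whose entries may carry "sift_score", "sift_prediction", "polyphen_score" and "polyphen_prediction". The sample record is hand-written for illustration, the transcript IDs are placeholders, and the network call itself is deliberately left out.

```python
def extract_scores(vep_records):
    """Collect SIFT/PolyPhen fields from decoded VEP-style JSON records.

    Transcript consequences without either score (e.g. synonymous changes) are skipped.
    """
    rows = []
    for record in vep_records:
        for tc in record.get("transcript_consequences", []):
            if "sift_score" not in tc and "polyphen_score" not in tc:
                continue
            rows.append({
                "transcript": tc.get("transcript_id"),
                "sift_score": tc.get("sift_score"),
                "sift_prediction": tc.get("sift_prediction"),
                "polyphen_score": tc.get("polyphen_score"),
                "polyphen_prediction": tc.get("polyphen_prediction"),
            })
    return rows

# Hand-written sample in the assumed response shape; not real data.
sample = [{
    "input": "hypothetical_variant",
    "transcript_consequences": [
        {"transcript_id": "TX_A", "sift_score": 0.01, "sift_prediction": "deleterious",
         "polyphen_score": 0.98, "polyphen_prediction": "probably_damaging"},
        {"transcript_id": "TX_B", "consequence_terms": ["synonymous_variant"]},
    ],
}]

rows = extract_scores(sample)
for row in rows:
    print(row)
```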

ADD COMMENTlink modified 9.3 years ago • written 9.4 years ago by Michi950
3
9.3 years ago by
Bamyasi150
Boston, MA
Bamyasi150 wrote:

They don't correct for paralogy or redundancy.

This is not true, at least not in the case of PolyPhen-2. It uses the PSIC conservation score, which is very robust and was specifically designed with highly redundant alignments in mind. PolyPhen-2 also has options to correct for paralogs (or to use a clean target database of true orthologs). Benchmarks show that paralog correction actually degrades accuracy slightly, so this option is disabled by default.

The rate of success of SIFT/PolyPhen2 is mainly due to evident constraints in protein structure.

SIFT does not use any secondary structure features for its predictions.

ADD COMMENTlink written 9.3 years ago by Bamyasi150
1
8.9 years ago by
Larry_Parnell16k
Boston, MA USA
Larry_Parnell16k wrote:

I'd be inclined to use the approach outlined in a recent paper entitled "The Predicted Impact of Coding Single Nucleotide Polymorphisms Database." The group used three computational tools: the Grantham matrix, Polymorphism Phenotyping (PolyPhen), and Sorting Intolerant from Tolerant (SIFT) algorithms. Their Predicted Impact of Coding SNPs database is available at http://www.icr.ac.uk/cancgen/molgen/MolPopGen_PICS_database.htm and is an ongoing project that will continue to curate and release data on the putative functionality of coding SNPs.

ADD COMMENTlink written 8.9 years ago by Larry_Parnell16k