There are many tools available online that let you assess the potential effect of a SNP or an amino acid substitution on protein function (phenotype). Some are based on sequence homology alone, some also use protein structure, and others use machine learning to combine many different variables.
Which one(s) do you prefer in terms of scientific basis, ease of use, scale of data handled, etc.?
The ones I am aware of are:
1. SIFT
2. PolyPhen
3. MAPP
4. Align-GVGD
5. PANTHER
6. PMut
7. SNPs3D
I agree. I also think methods that use structure-based prediction use it in addition to sequence conservation, not instead of it, but I am not sure. By the way, are you sure SIFT works with non-coding sequences? I think SIFT bases its predictions on amino acid conservation rather than nucleotide conservation.
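To make the "amino acid conservation" idea concrete, here is a rough toy sketch in Python. It is not SIFT's actual algorithm (SIFT builds its own alignments and uses normalized probabilities with pseudocounts); it only illustrates the general principle of scoring a substitution by how often the new residue appears at that position among homologous protein sequences. The function names, the example alignment, and the 0.05 cutoff are all assumptions made up for illustration.

```python
from collections import Counter


def column_frequencies(alignment, position):
    """Return the frequency of each amino acid in one alignment column.

    `alignment` is a list of equal-length, already-aligned protein
    sequences; gap characters ('-') are ignored.
    """
    residues = [seq[position] for seq in alignment if seq[position] != "-"]
    total = len(residues)
    if total == 0:
        return {}
    return {aa: n / total for aa, n in Counter(residues).items()}


def substitution_tolerated(alignment, position, new_residue, threshold=0.05):
    """Toy call: treat a substitution as tolerated if the new residue is
    seen at this column with frequency >= threshold.

    The threshold is an illustrative assumption, not a calibrated value.
    """
    freqs = column_frequencies(alignment, position)
    return freqs.get(new_residue, 0.0) >= threshold


# Hypothetical aligned homologs (made-up sequences, for illustration only).
example_alignment = ["MKVL", "MKIL", "MRVL", "MKVL"]

# Position 2 is V in three sequences and I in one.
print(substitution_tolerated(example_alignment, 2, "I"))  # True
print(substitution_tolerated(example_alignment, 2, "W"))  # False
```

Note that everything here works on the protein sequence, which is why an approach like this has nothing to say about a non-coding variant.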
No, SIFT still uses precalculated data for coding regions, and these tools are trained using SNPs that are known to be disease-causing. So as we get more conservation data from other species, and as we collect more disease causation/correlation data for non-coding human variants, tools like SIFT should get better at predicting what other non-coding variations may mean.