Question: Scoring whether genes are functional or not based on SIFT and SNPEff results
I've run SNPEff and SIFT on my dataset (a yeast species), so I now have functional annotations for my SNPs and InDels. I would like to score each gene in each sample based on these annotations, with the aim of identifying clusters of strains in which a given gene is likely functional in some strains and not in others. Is this a good idea, and is it feasible? Are there tools or scripts available that do this?


Please be aware that the results from these algorithms are predictions. You might have a SIFT score saying that a protein will be non-functional, but if you did a biological assay you might find that it works fine. Use them to inform further experiments, by all means, but don't take them as gospel.
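If the goal is only to shortlist candidates, one simple way to turn the annotations into a per-gene, per-sample score is to count the variants predicted to be damaging. This is a minimal sketch, not an established tool: the function names and the input layout (one tuple per variant with sample, gene, SnpEff impact category and SIFT score) are hypothetical, and the 0.05 SIFT cutoff is just the conventional threshold, so you would need to adapt the parsing and the cutoff to your own output files.

```python
from collections import defaultdict

# Conventional SIFT cutoff (scores at or below it are predicted deleterious).
# Assumption: adjust to suit your data.
SIFT_CUTOFF = 0.05


def score_genes(variants):
    """Count predicted-damaging variants per gene and per sample.

    variants: iterable of (sample, gene, snpeff_impact, sift_score) tuples,
    where sift_score may be None (e.g. for InDels without a SIFT prediction).
    This input layout is hypothetical; build it from your own annotation files.
    """
    scores = defaultdict(dict)
    for sample, gene, impact, sift in variants:
        damaging = impact == "HIGH" or (sift is not None and sift <= SIFT_CUTOFF)
        # Record the sample even when the count stays at 0.
        scores[gene][sample] = scores[gene].get(sample, 0) + int(damaging)
    return {gene: dict(per_sample) for gene, per_sample in scores.items()}


def variable_genes(scores, samples):
    """Return genes predicted disrupted in some samples but not in others."""
    selected = []
    for gene, per_sample in scores.items():
        flags = {per_sample.get(sample, 0) > 0 for sample in samples}
        if flags == {True, False}:
            selected.append(gene)
    return sorted(selected)
```

The resulting gene-by-sample counts can then be fed into whatever clustering you prefer. A count of zero only means that no damaging variant was predicted, not that the gene is known to be functional.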

Thanks for your answer. I do know that SIFT results are only predictions, and there are no doubt several false positives among them. However, in this analysis we just want to identify candidate genes.

Edit: I originally wrote "true positives" instead of "false positives"; sorry about that.

