I'm aware that the science hasn't advanced far enough for us to have a good understanding of this, or accurate computational prediction software for it, but I'm guessing there must be plenty of geneticists dedicated to finding out how specific variants alter the functioning of genes. I realise this is a very complex problem because there are so many variables involved.

The data I need is how much a particular variant alters the structure of the protein and/or the expression of the gene, and most importantly what effect this has on the functioning of the gene. For example, some variants in the promoter region of a gene can vastly increase (or decrease) its expression, so it ends up producing far more (or less) of the protein, and that type of variant helps give clues. Nonsense variants, which introduce a premature stop codon, will heavily alter the structure of the protein when they fall in the middle of the coding sequence, possibly rendering it useless. (Missense variants, by contrast, swap one amino acid for another.) I know there are studies out there that have determined how variants of certain enzymes alter serum concentrations of their substrate metabolites. Take CYP2R1, for example: I read studies that found that people with certain variants of that gene need to take 3X as much vitamin D to reach the same levels of the active hormone form of vitamin D as people without the variant. A tool like BeFree could be used to mine that kind of data.
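To make the coding-variant categories concrete, here is a minimal sketch of how a single-codon substitution can be classified using the standard genetic code: a change that creates a stop codon is usually called nonsense, and one that swaps the amino acid is missense. The function name `classify_substitution` and the label strings are my own choices for illustration, not taken from any particular tool, and the sketch ignores everything beyond the codon itself (splicing, frame shifts, regulatory effects).

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid.

    The returned labels are illustrative names, not an official vocabulary.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid, protein sequence unchanged
    if alt_aa == "*":
        return "nonsense"     # premature stop codon, truncated protein
    if ref_aa == "*":
        return "stop_lost"    # original stop codon replaced by an amino acid
    return "missense"         # one amino acid swapped for another


if __name__ == "__main__":
    for ref, alt in [("CAG", "TAG"), ("GAG", "GTG"), ("CTT", "CTC")]:
        print(ref, "->", alt, classify_substitution(ref, alt))
```

This only covers substitutions inside a coding region; promoter and other regulatory variants need expression data rather than a codon lookup.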

So what I'm wondering is whether there's a central database dedicated to compiling information on how specific genotypes alter the function of a gene, and in what way they alter it, along with all of the other information that goes with this, such as what the implications are for the encoded protein's ability to do its job.

Also, I've heard about computational prediction tools, which are currently at a primitive stage but can still provide somewhat accurate information. I'm guessing there is a data source containing the output files from these programs (as with the Gaussian computational software, where there are sites from which you can download data files that were generated on supercomputers).

I basically need to gather as much information and as many variables as I can to help me make better guesses/estimates about how a SNP (or a group of related SNPs, such as a haplotype, a genoset, or SNPs involved in gene-gene interactions) alters the ability of a gene to do its job, and in what way. I'm after anything I can use to build my own algorithms for making more accurate guesses and predictions about what a variant, or group of variants, means for the gene's ability to function. Are there any databases that were created specifically for this?
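To show the shape of what I'm trying to collect, here is a rough sketch of the per-variant record I have in mind. Every field name here is hypothetical, the variant IDs are placeholders rather than real rsIDs, and the values are left empty or are made-up placeholders because filling them in is exactly what I need a data source for.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VariantEvidence:
    """One row of evidence about a single variant (all fields hypothetical)."""
    variant_id: str                                  # placeholder ID, not a real rsID
    gene: str
    consequence: Optional[str] = None                # e.g. "missense", "nonsense", "promoter"
    expression_fold_change: Optional[float] = None   # from an expression study, if any
    notes: List[str] = field(default_factory=list)   # free-text findings from papers


def group_by_gene(records: List[VariantEvidence]) -> Dict[str, List[VariantEvidence]]:
    """Collect variants per gene so related SNPs can be looked at together."""
    grouped: Dict[str, List[VariantEvidence]] = {}
    for rec in records:
        grouped.setdefault(rec.gene, []).append(rec)
    return grouped


if __name__ == "__main__":
    records = [
        VariantEvidence("variant_A", "CYP2R1", consequence="missense"),
        VariantEvidence("variant_B", "CYP2R1", notes=["placeholder note"]),
        VariantEvidence("variant_C", "GENE_X", consequence="promoter"),
    ]
    for gene, variants in group_by_gene(records).items():
        print(gene, [v.variant_id for v in variants])
```

Grouping by gene is only the simplest case; haplotypes and gene-gene interactions would need a grouping key of their own.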

