This may be less technical than most questions here, but it's something I've been wondering about for some time.
I have studied biology and, to a lesser degree, bioinformatics (one Python bioinformatics course). It seems to me the holy grail of bioinformatics would be to plug in a sequence and be able to predict all the proteins in the organism and how they would be expressed. Basically, the complete interconversion of genotype and phenotype in silico, which is of course physically possible but technically challenging.
My question is to what extent this is possible today, and what the biggest hurdles to accomplishing it are. Realistically, I'm thinking about less impressive comparisons: can we compare the sequence of a black Labrador with that of a white Labrador and confidently say which genes or promoters are responsible for the difference in color? Obviously it is easier when we know which metabolic pathway to check, but how difficult is it to subtract one genome from another and then assign a phenotypic effect to each of the differences?
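To make "subtracting" one genome from another concrete, here is a toy sketch of what I have in mind. It assumes the two sequences are already aligned and of equal length, which real genomes would not be without an alignment and variant-calling step first. The function name `diff_aligned` and both sequences are made up for illustration; they are not real Labrador data.

```python
def diff_aligned(seq_a, seq_b):
    """Return (position, base_a, base_b) for every mismatch between two
    pre-aligned, equal-length sequences (an assumption of this sketch)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Hypothetical toy sequences differing at a single position.
black = "ATGGCGTACCTGA"
white = "ATGGCGTGCCTGA"

differences = diff_aligned(black, white)
print(differences)
```

Listing the differences like this seems like the easy part. What I'm asking about is the step after it: assigning a phenotypic effect to each difference.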
It seems to me that machine learning and other clever techniques have been applied successfully to image recognition, natural language processing, and other problems where the inputs are significantly less friendly to computer processing than sequencing data is.
Is it an issue of computing power, programming, insufficient sequencing data or something else?