The Relationship Between Protein Abundance And Protein Function
8.7 years ago
Hi, I have a list of proteins with quantification data, and I want to analyze the relationship between protein abundance and protein function.
It seems that low-abundance proteins tend to have important regulatory functions, while highly abundant proteins are simply structural components or have no known function.
Is there any existing theory about this kind of relationship?
And which method or model should I use to verify my hypothesis?
Any suggestions? Thanks!

This is not exactly what you were asking for, but there is some literature showing that highly expressed proteins are also more conserved. You could start with Drummond et al. 2005 and then check the papers that cite it.

I think that for your analysis, you should take into account a few difficulties:

  • The techniques used to measure protein expression have limited sensitivity, so low-abundance proteins are often missed. Regardless of which expression database you use, the more highly expressed genes will be over-represented.
  • Some gene expression databases contain data from immortalized or cancer cell lines. For example, ENCODE has been criticized for this. So, if you find that cell-cycle proteins are overexpressed, be careful with the interpretation.
  • You should define more precisely what you mean by "protein function". Are you referring to a specific Gene Ontology class?
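
Once "function" is pinned down to a specific annotation, one possible way to test the hypothesis in the question is an enrichment test: bin the proteins by abundance and ask whether the annotation is over-represented in the low-abundance bin. Below is a minimal sketch using a one-sided hypergeometric test. The function names, the median split, and the toy data are all assumptions made for illustration, not an established protocol.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided p-value P(X >= k) for X ~ Hypergeometric.

    N: total proteins, K: proteins carrying the annotation,
    n: proteins in the bin of interest, k: annotated proteins in that bin.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def low_abundance_enrichment(proteins, fraction=0.5):
    """proteins: list of (abundance, has_annotation) pairs.

    Takes the lowest `fraction` of proteins by abundance as the bin of
    interest and tests whether the annotation is enriched there.
    """
    ranked = sorted(proteins, key=lambda p: p[0])
    n = int(len(ranked) * fraction)
    k = sum(1 for _, annotated in ranked[:n] if annotated)
    K = sum(1 for _, annotated in ranked if annotated)
    return hypergeom_enrichment_p(k, n, K, len(ranked))


# Hypothetical toy data: (abundance, annotated as "regulatory"?)
toy = [
    (1.2, True), (2.0, True), (3.1, True), (4.5, False),
    (50.0, False), (80.0, True), (120.0, False), (300.0, False),
]
p_value = low_abundance_enrichment(toy)
print(f"enrichment p-value: {p_value:.4f}")
```

If you test many annotation classes this way, you would need a multiple-testing correction, and the detection bias mentioned in the first point still applies: proteins missing from the low-abundance bin are not missing at random.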
8.7 years ago
There is a correlation between gene expression and function, but it is not as pervasive (across all kingdoms) as the correlation between gene expression and nonsynonymous divergence (Ka), which is in contrast to the pattern with synonymous divergence (Ks). That is, genes with lower Ka tend to have higher and broader expression patterns, and this is true for Arabidopsis as well as bacteria, yeast, worms, mice and humans (Yang and Gaut 2011; Drummond and Wilke 2008). According to Pal et al. (2006), gene expression is the predominant predictor of evolutionary rate variation among proteins, surpassing even direct measures of functional importance (Gaut et al. 2011).
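
If you want to check the expression vs. Ka relationship in your own data, a rank correlation is a reasonable first look, since expression values are usually heavily skewed. Here is a minimal pure-Python sketch of Spearman's rho (Pearson correlation computed on average ranks). The helper names and the numbers are hypothetical; in practice you would more likely use an existing statistics package.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: expression level and Ka for five genes.
expression = [5, 20, 80, 300, 1000]
ka = [0.30, 0.22, 0.15, 0.08, 0.02]
rho = spearman(expression, ka)
print(f"Spearman rho: {rho:.3f}")
```

A strongly negative rho would be consistent with the pattern described above (higher expression, lower Ka), but it says nothing about which of the proposed explanations is responsible.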

I am aware of three explanations for this correlation, and Giovanni pointed out two of them.

  • First, there is selection for robustness against mistranslation (Drummond et al. 2005).
  • A second explanation is selection against nonsynonymous changes that result in suboptimal codons, the rationale being that highly expressed genes need to be translated more accurately.
  • The last explanation is that more highly expressed genes are more functionally important and therefore experience higher levels of constraint.

Drummond and Gaut are two of the biggest names in the field, so their work is a good place to start a literature search. In particular, I highly recommend you read Drummond et al. 2005 and Gaut et al. 2011 (both linked to above). The latter is what I drew on when writing this post (it is specific to plants, but an excellent read).

8.7 years ago
These references might be useful to you:

Vogel et al. 2012; Laurent et al. 2010; Vogel et al. 2010; de Sousa Abreu et al. 2009


