This is very interesting. I have to read more papers on it now. Thanks for the information.
I was searching for a list of essential genes in human. I found http://www.essentialgene.org/, which has 118 human genes in its database. However, I am sure that the number of essential genes in human is much higher than 118. Does anyone know of another database of essential genes? Also, if you know of any for organisms other than human, I would be happy to hear about them. Thanks.
Hi, here are some other references for human, worm and Arabidopsis.
CEG (Cluster of Essential Genes) is a database containing clusters of orthologous essential genes, developed by the CEFG Group at UESTC. The original data for generating CEG are derived from the DEG database, which was published in NAR in 2004 and 2009. Unlike DEG, the CEG database stores essential genes as orthologous groups rather than as single genes. The current version contains 16 species:
Understanding the biology of C. elegans relies on identification and analysis of essential genes, genes required for growth to a fertile adult. Approaches for identifying essential genes include several types of classical forward genetic screens, genome-wide RNA interference screens and systematic targeted gene knockout. Based on most estimates made from screening results thus far, from 15–30% of C. elegans genes appear to be essential. Genetic redundancy masks some essential functions and pleiotropy of many essential genes poses a challenge for a full understanding of their functions. Temperature sensitive mutations are valuable tools for studies of essential genes, but our ability to analyze essential genes would benefit from development of new tools for conditional inactivation or activation of specific genes.
Essential Genes in Arabidopsis Seed Development : This project deals with genes that exhibit a seed phenotype when disrupted by a loss-of-function mutation. The updated database (December, 2010) includes 481 genes and 888 mutants. More than 60% of these mutants have been analyzed in the Meinke laboratory at Oklahoma State University. Recent additions not included in the database are listed at the Supplemental Gene Dataset link on the Access Page.
Genes subject to LOF (loss of function) may allow you to infer which genes are not necessary to reach adulthood. The 1000 Genomes Project has made it possible to find such LOF genes. Unfortunately, there is little phenotypic information about the 1000G subjects. An apparently healthy adult with a given gene deleted, or non-functional at both copies, may in fact have poor vision or poor sperm quality: both are minor phenotypes that would not seem out of the ordinary in a population of individuals (many people wear glasses or are childless, for example). This gets at the question of what is truly essential.
So, while I have not given you a source where you can find a ready-made list of genes, I think the LOF genes from 1000G provide some material for real thought on this topic.
Have you checked out the CEGMA pipeline from Ian Korf's lab at UC Davis?
It is more of a tool for testing the "completeness" of genome sequencing projects, but does so by testing for the presence of core genes. There is a cool paper about the pipeline here, and a subsequent study it was used on here and here. They identify 458 core proteins across a wide range of taxa, of which 248 are the most highly conserved and can thus be found in even lower coverage (~2X) genomes.
There is also the COG (Clusters of Orthologous Groups of proteins) database at NCBI (also containing the Eukaryote specific clusters or KOGs - which were used as the basis for the CEGMA development, along with some COGs). Papers for the COG database are here:
Check out the Conserved Domain Database also at NCBI and the KOG browser at JGI.
It's worth pointing out that, in general, we can only infer which genes are essential for humans by comparative genomics with model organisms. For obvious ethical reasons, you can't knock out a gene in a human subject to see whether the effect is lethal :)
Hi Neilfws, is it possible to see this effect using simulation? I mean, knocking out a gene in silico and seeing the effect?
I suppose you could make inferences, knowing something about human metabolism and looking at the metabolic networks in which genes are involved. I'd imagine, for instance, that a functional cytochrome oxidase complex is essential for humans. However, it would still be inference, not observed experimental "fact".
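To make the idea of an in silico knockout concrete, here is a minimal Python sketch. It is only a toy: the network, the gene names (`g1` to `g4`), the metabolite names and the helper functions are all invented for illustration, and it uses a simple reachability check, which is much cruder than constraint-based approaches such as flux balance analysis. A gene is called "essential" here only in the sense that removing it stops the toy network from producing the target.

```python
from typing import List, Set, Tuple

# Each reaction: (required genes, input metabolites, output metabolites).
# All names are invented for illustration; this is not a real human pathway.
Reaction = Tuple[Set[str], Set[str], Set[str]]

TOY_NETWORK: List[Reaction] = [
    ({"g1"}, {"nutrient"}, {"m1"}),
    ({"g2"}, {"m1"}, {"m2"}),
    ({"g3"}, {"m1"}, {"m2"}),  # g3 is redundant with g2
    ({"g4"}, {"m2"}, {"biomass"}),
]


def can_produce(network: List[Reaction], knocked_out: Set[str],
                seeds: Set[str], target: str) -> bool:
    """Return True if `target` is reachable from `seeds` without the knocked-out genes."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for genes, inputs, outputs in network:
            if genes & knocked_out:
                continue  # reaction is disabled if any required gene is missing
            if inputs <= available and not outputs <= available:
                available |= outputs
                changed = True
    return target in available


def essential_genes(network: List[Reaction], seeds: Set[str], target: str) -> Set[str]:
    """Knock out each gene in turn and keep those whose loss blocks the target."""
    all_genes = set().union(*(genes for genes, _, _ in network))
    return {g for g in all_genes if not can_produce(network, {g}, seeds, target)}


print(sorted(essential_genes(TOY_NETWORK, {"nutrient"}, "biomass")))  # ['g1', 'g4']
```

Note that `g2` and `g3` each come out as non-essential because they back each other up, even though losing both blocks the pathway. That is the kind of redundancy that can mask essential functions, and, as said above, the result is still an inference from the model rather than an observation.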
Some interesting modelling papers online! Just been having a quick search - http://bioinformatics.oxfordjournals.org/content/26/4/536.full and http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0020072 for example.
Thanks a lot ... the paper helps ...
True, genes cannot be knocked out in humans, but 1000G data has identified many loss-of-function variants (see my response below) that can allow us to uncover the roles of these genes, once deep phenotyping has been done, and whether they are "essential" or not.
That's an interesting idea, but note that you can only detect non-essential genes this way, not essential ones.
I know. If enough of this analysis were undertaken, one could arrive at a list of (highly confident) essential genes: those, e.g., never seen in an LOF analysis.
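The logic is just a set difference, which can be sketched in Python. This is an illustration only, not an actual 1000G analysis: the gene symbols are made-up placeholders and `candidate_essential_genes` is a hypothetical helper, not part of any existing tool.

```python
from typing import Iterable, Set


def candidate_essential_genes(
    all_genes: Iterable[str],
    homozygous_lof_genes: Iterable[str],
) -> Set[str]:
    """Return genes never observed with loss of function at both copies.

    Absence from the LOF set is only weak evidence of essentiality: it may
    also reflect a small sample or a gene that simply carries few variants.
    """
    return set(all_genes) - set(homozygous_lof_genes)


# Made-up placeholder symbols, not real data.
all_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
homozygous_lof_genes = ["GENE_B", "GENE_D"]

candidates = candidate_essential_genes(all_genes, homozygous_lof_genes)
print(sorted(candidates))  # ['GENE_A', 'GENE_C']
```

The more individuals are screened, the more confident one can be in what is left over, but the list remains a set of candidates until there is phenotype data to back it up.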
Very interesting discussion. I really learned a lot. And @Steve Moss, thanks for the link to the paper.