I was searching for a list of essential genes in humans. I found http://www.essentialgene.org/, which has 118 human genes in its database. However, I am sure that the number of essential genes in humans is much more than 118. Does anyone know of another database of essential genes? Also, if you know of any for organisms other than human, I would be happy to hear about them. Thanks.
Hi, here are some other references for human, worm and Arabidopsis.
CEG (Cluster of Essential Genes) is a database containing clusters of orthologous essential genes, developed by the CEFG Group at UESTC. The original data used to generate CEG are derived from the DEG database, which was published in NAR in 2004 and 2009. Unlike DEG, the CEG database stores essential genes as orthologous groups rather than as single genes. The current version contains 16 species:
Understanding the biology of C. elegans relies on identification and analysis of essential genes, genes required for growth to a fertile adult. Approaches for identifying essential genes include several types of classical forward genetic screens, genome-wide RNA interference screens and systematic targeted gene knockout. Based on most estimates made from screening results thus far, from 15–30% of C. elegans genes appear to be essential. Genetic redundancy masks some essential functions and pleiotropy of many essential genes poses a challenge for a full understanding of their functions. Temperature sensitive mutations are valuable tools for studies of essential genes, but our ability to analyze essential genes would benefit from development of new tools for conditional inactivation or activation of specific genes.
Essential Genes in Arabidopsis Seed Development: This project deals with genes that exhibit a seed phenotype when disrupted by a loss-of-function mutation. The updated database (December, 2010) includes 481 genes and 888 mutants. More than 60% of these mutants have been analyzed in the Meinke laboratory at Oklahoma State University. Recent additions not included in the database are listed at the Supplemental Gene Dataset link on the Access Page.
Genes subject to LOF (loss-of-function) variants may allow you to infer which genes are not necessary to reach adulthood. The 1000 Genomes Project has made it possible to find such LOF genes. Unfortunately, there is little phenotypic information about the 1000G subjects. An apparently healthy adult who has a given gene deleted, or non-functional at both copies, may in fact have poor vision or poor sperm quality: minor phenotypes that would not seem out of the ordinary in a population of individuals (many people wear glasses or are childless, for example). This gets at the question of what is truly essential.
So, while I have not given you a source with a ready-made list of genes, I think the LOF genes from 1000G provide some material for real thought on this topic.
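To make the LOF idea a bit more concrete, here is a minimal Python sketch of how one might shortlist genes that are probably not required to reach adulthood. Everything about the input is an assumption for illustration: the tuple layout (sample, gene, number of LOF copies), the sample IDs and the gene names are made up, and this is not the 1000 Genomes file format. It also treats a count of 2 as "no working copy", which glosses over phasing and annotation errors.

```python
from collections import defaultdict


def homozygous_lof_genes(calls):
    """Map each gene to the set of samples with LOF variants on both copies.

    `calls` is an iterable of (sample_id, gene, lof_copies) tuples, where
    lof_copies is 0, 1 or 2. This layout is hypothetical, chosen only to
    keep the example short.
    """
    carriers = defaultdict(set)
    for sample_id, gene, lof_copies in calls:
        # Only individuals with no functional copy tell us the gene is
        # dispensable for surviving to adulthood.
        if lof_copies == 2:
            carriers[gene].add(sample_id)
    return dict(carriers)


# Toy data: sample IDs and gene names are invented.
calls = [
    ("S1", "GENE_A", 2),
    ("S2", "GENE_A", 1),
    ("S1", "GENE_B", 1),
    ("S3", "GENE_C", 2),
    ("S4", "GENE_C", 2),
]

candidates = homozygous_lof_genes(calls)
for gene in sorted(candidates):
    print(gene, len(candidates[gene]))
```

Genes that show up here are candidates for "not essential for reaching adulthood" only; as noted above, the carriers may still have phenotypes that nobody recorded.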
CEGMA is more of a tool for testing the "completeness" of genome sequencing projects, but it does so by testing for the presence of core genes. There is a cool paper about the pipeline here, and a subsequent study it was used on here and here. They identify 458 core proteins across a wide range of taxa, of which 248 are the most highly conserved and can thus be found even in lower-coverage (~2X) genomes.
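The completeness test described above boils down to asking what fraction of a reference set of core genes can be found in an assembly. The sketch below shows only that final bookkeeping step in Python; it is not the actual pipeline, which has to search the genome for each core protein first. The function name and the identifiers in the example are hypothetical.

```python
def core_gene_completeness(core_genes, detected_genes):
    """Return (fraction of core genes detected, sorted list of missing ones).

    Both arguments are iterables of gene or cluster identifiers.
    Detected identifiers that are not in the core set are ignored.
    """
    core = set(core_genes)
    if not core:
        raise ValueError("core gene set is empty")
    found = core & set(detected_genes)
    missing = sorted(core - found)
    return len(found) / len(core), missing


# Made-up identifiers, for illustration only.
core = ["CORE_1", "CORE_2", "CORE_3", "CORE_4"]
detected = ["CORE_1", "CORE_3", "OTHER_9"]

fraction, missing = core_gene_completeness(core, detected)
print(f"{fraction:.0%} of core genes found; missing: {missing}")
```

A low fraction can mean an incomplete assembly rather than genuinely absent genes, which is why the most highly conserved subset is the more useful yardstick for low-coverage genomes.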
There is also the COG (Clusters of Orthologous Groups of proteins) database at NCBI, which also contains the eukaryote-specific clusters, or KOGs. These, along with some COGs, were used as the basis for the CEGMA development. Papers for the COG database are here: