Question: Database Of Essential Genes
8.4 years ago by
pegahtv140 wrote:

I was searching for a list of essential genes in human. I found one database, which has 118 genes for human. However, I am sure that the number of essential genes in human is much more than 118. Does anyone know of another database of essential genes? Also, if you know of any for organisms other than human, I would be happy to hear about them. Thanks.

database genes • 9.3k views
modified 17 months ago by Biostar ♦♦ 20 • written 8.4 years ago by pegahtv140

It's worth pointing out that, in general, we can only infer which genes are essential for humans by comparative genomics with model organisms. The reason is an obvious ethical one: you can't knock out a gene in a human subject to see whether the effect is lethal :)

written 8.4 years ago by Neilfws48k

Hi Neilfws, is it possible to see this effect using simulation? I mean knocking out a gene in silico and seeing the effect.

written 8.4 years ago by Gjain5.5k

I suppose you could make inferences, knowing something about human metabolism and looking at the metabolic networks in which genes are involved. I'd imagine, for instance, that a functional cytochrome oxidase complex is essential for humans. However, it would still be inference, not observed experimental "fact".
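To make the idea of a simulated knockout concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the three-reaction network, the gene names (`geneA` to `geneD`) and the metabolite names are hypothetical, and the method is a plain reachability check rather than any published modelling approach. A gene is called essential here only if removing it stops the toy network from producing `biomass`.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical toy network: each reaction is (substrates, products, genes).
# Assumption: any ONE of the listed genes is enough to catalyse the reaction.
Reaction = Tuple[Set[str], Set[str], Set[str]]

TOY_REACTIONS: List[Reaction] = [
    ({"glucose"}, {"pyruvate"}, {"geneA"}),
    ({"pyruvate"}, {"acetyl_coa"}, {"geneB", "geneC"}),  # redundant pair
    ({"acetyl_coa"}, {"biomass"}, {"geneD"}),
]


def can_produce(target: str, nutrients: Iterable[str],
                reactions: List[Reaction], knocked_out: Set[str]) -> bool:
    """True if `target` is still reachable from `nutrients` after removing
    every reaction whose catalysing genes are all knocked out."""
    available = set(nutrients)
    active = [r for r in reactions if r[2] - knocked_out]
    changed = True
    while changed:  # keep firing reactions until nothing new appears
        changed = False
        for substrates, products, _genes in active:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return target in available


def essential_genes(target: str, nutrients: Iterable[str],
                    reactions: List[Reaction]) -> Set[str]:
    """Genes whose single knockout makes `target` unreachable."""
    nutrients = set(nutrients)
    genes = set().union(*(r[2] for r in reactions))
    return {g for g in genes
            if not can_produce(target, nutrients, reactions, {g})}


print(sorted(essential_genes("biomass", {"glucose"}, TOY_REACTIONS)))
```

In this toy case `geneB` and `geneC` back each other up, so neither is flagged on its own. As with any such model, the result is still an inference from the network, not an observation.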

written 8.4 years ago by Neilfws48k

Some interesting modelling papers online! Just been having a quick search - and for example.

written 8.4 years ago by Steve Moss2.3k

Thanks a lot ... the paper helps ...

written 8.4 years ago by Gjain5.5k

True, genes cannot be knocked out in humans, but 1000G data has identified many loss-of-function variants (see my response below) that can allow us to uncover the roles of these genes, once deep phenotyping has been done, and whether they are "essential" or not.

written 8.4 years ago by Larry_Parnell16k

That's an interesting idea, but note that you can only detect non-essential genes that way, not essential ones.

modified 8.4 years ago • written 8.4 years ago by Michael Dondrup47k

I know. If enough of this analysis were undertaken, one could arrive at a list of (highly confident) essential genes: those, e.g., never seen in an LOF analysis.
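The bookkeeping behind that idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (pairs of individual and gene, meaning both copies of the gene are non-functional in that individual), the function name, the `min_individuals` threshold and the gene names are all hypothetical, not a 1000G file format or API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def candidate_essential_genes(all_genes: Iterable[str],
                              lof_observations: Iterable[Tuple[str, str]],
                              min_individuals: int = 1) -> Set[str]:
    """Return genes never (or too rarely) seen with loss of function.

    lof_observations: hypothetical (individual_id, gene) pairs, each meaning
    that individual carries LOF variants at both copies of the gene.
    """
    carriers: Dict[str, Set[str]] = defaultdict(set)
    for individual, gene in lof_observations:
        carriers[gene].add(individual)
    # A gene counts as tolerated (non-essential) once enough distinct
    # individuals are seen living without it.
    tolerated = {gene for gene, people in carriers.items()
                 if len(people) >= min_individuals}
    return set(all_genes) - tolerated


# Made-up example data.
genes = ["GENE1", "GENE2", "GENE3", "GENE4"]
observations = [("ind1", "GENE2"), ("ind2", "GENE2"), ("ind3", "GENE4")]
print(sorted(candidate_essential_genes(genes, observations)))
```

The result is a candidate list only. A gene can be missing from the LOF set simply because too few people have been sequenced, which is why the confidence grows with the amount of analysis done.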

written 8.4 years ago by Larry_Parnell16k

Very interesting discussion. I really learned a lot. And @Steve Moss, thanks for the link to the paper.

written 8.4 years ago by pegahtv140
8.4 years ago by
Munich, Germany
Gjain5.5k wrote:

Hi, here are some other references for human, worm and Arabidopsis.

  • CEG (Cluster of Essential Genes) is a database containing clusters of orthologous essential genes, developed by the CEFG Group at UESTC. The original data for generating CEG are derived from the database DEG, which was published in NAR in 2004 and 2009. Unlike DEG, the CEG database stores essential genes as orthologous groups rather than as single genes. The current version contains 16 species:

    • Bacillus subtilis 168
    • Staphylococcus aureus N315
    • Vibrio cholerae N16961
    • Escherichia coli MG1655
    • Haemophilus influenzae Rd KW20
    • Mycoplasma genitalium G37
    • Streptococcus pneumoniae
    • Helicobacter pylori 26695
    • Mycobacterium tuberculosis H37Rv
    • Salmonella typhimurium LT2
    • Francisella novicida U112
    • Acinetobacter baylyi ADP1
    • Mycoplasma pulmonis UAB CTIP
    • Pseudomonas aeruginosa UCBPP-PA14
    • Salmonella enterica serovar Typhi
    • Staphylococcus aureus NCTC 8325
  • Understanding the biology of C. elegans relies on identification and analysis of essential genes, genes required for growth to a fertile adult. Approaches for identifying essential genes include several types of classical forward genetic screens, genome-wide RNA interference screens and systematic targeted gene knockout. Based on most estimates made from screening results thus far, from 15–30% of C. elegans genes appear to be essential. Genetic redundancy masks some essential functions and pleiotropy of many essential genes poses a challenge for a full understanding of their functions. Temperature sensitive mutations are valuable tools for studies of essential genes, but our ability to analyze essential genes would benefit from development of new tools for conditional inactivation or activation of specific genes.

  • Essential Genes in Arabidopsis Seed Development: This project deals with genes that exhibit a seed phenotype when disrupted by a loss-of-function mutation. The updated database (December, 2010) includes 481 genes and 888 mutants. More than 60% of these mutants have been analyzed in the Meinke laboratory at Oklahoma State University. Recent additions not included in the database are listed at the Supplemental Gene Dataset link on the Access Page.

written 8.4 years ago by Gjain5.5k
8.4 years ago by
Boston, MA USA
Larry_Parnell16k wrote:

Genes subject to LOF (loss of function) may allow you to infer which genes are not necessary to reach adulthood. The 1000 Genomes Project has allowed LOF genes to be found. Unfortunately, there is little phenotypic information about the 1000G subjects. Perhaps what appears to be a healthy adult with a given gene deleted, or without function at both copies, in fact has poor vision or poor sperm quality: both are minor phenotypes that would not seem out of the ordinary in a population of individuals (many people wear glasses or are childless, for example). This gets at the question of what is truly essential.

So, while I have not given you a source where you can find a given list of genes, I think the LOF genes from 1000G provide some material for real thought on this topic.

written 8.4 years ago by Larry_Parnell16k

This is very interesting. I have to read more papers on it now. Thanks for the information.

written 8.4 years ago by Gjain5.5k

Thanks for the information. Very interesting. I will read about it.

written 8.4 years ago by pegahtv140
8.4 years ago by
United Kingdom
Steve Moss2.3k wrote:

Have you checked out the CEGMA pipeline from Ian Korf's lab at UC Davis?

It is more of a tool for testing the "completeness" of genome sequencing projects, but it does so by testing for the presence of core genes. There is a cool paper about the pipeline here, and a subsequent study it was used on here and here. They identify 458 core proteins across a wide range of taxa, of which 248 are the most highly conserved and can thus be found even in lower-coverage (~2X) genomes.
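As a rough illustration of what "completeness" means here, the final arithmetic is just the share of the core set that was found in the assembly. The Python sketch below shows only that step; it is not how CEGMA itself is implemented, and the function name and the identifiers (`core1` and so on) are hypothetical.

```python
from typing import Iterable


def completeness(core_genes: Iterable[str], found_genes: Iterable[str]) -> float:
    """Percentage of the core gene set that was detected in an assembly."""
    core = set(core_genes)
    if not core:
        raise ValueError("core gene set is empty")
    # Only genes that belong to the core set count towards the score.
    return 100.0 * len(core & set(found_genes)) / len(core)


# Made-up example: 3 of 4 core genes detected, plus one unrelated hit.
core_set = ["core1", "core2", "core3", "core4"]
detected = ["core1", "core3", "core4", "other9"]
print(completeness(core_set, detected))
```

With the real pipeline the core set would be the 248 most highly conserved proteins mentioned above, and the hard part is the detection, not this ratio.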

There is also the COG (Clusters of Orthologous Groups of proteins) database at NCBI (also containing the Eukaryote specific clusters or KOGs - which were used as the basis for the CEGMA development, along with some COGs). Papers for the COG database are here:

Check out the Conserved Domain Database also at NCBI and the KOG browser at JGI.

modified 8.4 years ago • written 8.4 years ago by Steve Moss2.3k