Some possible features of a 'good' source:
- More than just a list of 'cancer genes'. A single abstract that mentions the gene name and the word 'cancer' does not make a convincing cancer gene, although a more sophisticated text-mining approach would be acceptable.
- The gene will be annotated as a tumor suppressor or oncogene with additional information on how this classification is justified
- Other relevant annotation such as whether the gene is involved in DNA repair, apoptosis, etc.
- Up to date. A spreadsheet from 10 years ago is less useful than a routinely updated source.
- Free and open source, although if you know of commercial options, please suggest them.
Here are some of the things I have found so far:
- The Wikipedia entries above list categories of oncogenes, but not every member of a category (not all RTKs, for example) will necessarily act as an oncogene.
- The Gene Ontology used to have a term 'Tumor Suppressor' but this has been superseded by the term 'regulation of cell cycle'. A gene involved in regulation of the cell cycle may generally have the potential to function as a tumor suppressor, but it would be nice to know which genes have been demonstrated to do so, and how. A combination of GO terms and evidence codes might be acceptable if someone wishes to elaborate on this approach.
- UniProtKB has a keyword 'Proto-oncogene' associated with 560 genes and a keyword 'Tumor Suppressor' associated with 631 genes.
- A compilation of cancer gene lists from the 'Bushman Lab'
- The Sanger Cancer Gene Census
- A more empirical approach might involve using patterns of somatic mutation across many cancers to identify likely tumor suppressors and oncogenes. Tumor suppressors would be expected to show loss-of-function alterations (copy number deletions, nonsense mutations, or missense mutations spread across multiple sites in the gene), whereas oncogenes would be characterized by recurrent alterations (amplifications, mutation hotspots, gene fusions involving a particular gene partner, etc.). COSMIC is already working along these lines, but if there are others, please post. The COSMIC Cancer Mutation Census is a newer resource that perhaps addresses this question even more directly.
- The Cancer HotSpots Resource looks for recurrently mutated hotspots by mining tumor sequence data. Such hotspots can indicate that a gene is an oncogene.
- CHASM-Plus is a nice resource that takes a machine learning approach to scoring driver mutations.
- A variety of older websites that list tumor suppressors: TSGDB, Tumor Gene Database, and TAG
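
To elaborate on the GO terms plus evidence codes idea mentioned above, here is a minimal sketch. It assumes you have downloaded a GO annotation file in GAF 2.x format, and it keeps only genes annotated to a chosen term with an experimental evidence code. The function name and the toy rows (with placeholder gene symbols and accessions) are made up for illustration. The sketch does not expand the term to its descendants in the ontology, which a real analysis would probably want to do.

```python
import csv

# Experimental evidence codes as defined by the GO Consortium.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def genes_with_experimental_support(gaf_lines, go_ids):
    """Return gene symbols annotated to any of go_ids with experimental evidence.

    gaf_lines: iterable of text lines in GAF 2.x format (tab-separated).
    go_ids: set of GO identifiers of interest (descendant terms are NOT expanded).
    """
    genes = set()
    # GAF header/comment lines start with '!'.
    data_lines = (line for line in gaf_lines if not line.startswith("!"))
    reader = csv.reader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 7:
            continue  # skip malformed or blank rows
        # GAF columns (1-based): 3 = symbol, 4 = qualifier, 5 = GO ID, 7 = evidence code
        symbol, qualifier, go_id, evidence = row[2], row[3], row[4], row[6]
        if "NOT" in qualifier.split("|"):
            continue  # negated annotations do not support the term
        if go_id in go_ids and evidence in EXPERIMENTAL_CODES:
            genes.add(symbol)
    return genes


# Toy rows with placeholder symbols and accessions, NOT real annotations.
toy_gaf = [
    "!gaf-version: 2.2",
    "UniProtKB\tX00001\tGENE_A\tinvolved_in\tGO:0051726\tPMID:0\tIMP\t\tP",
    "UniProtKB\tX00002\tGENE_B\tinvolved_in\tGO:0051726\tPMID:0\tIEA\t\tP",
    "UniProtKB\tX00003\tGENE_C\tNOT|involved_in\tGO:0051726\tPMID:0\tIDA\t\tP",
    "UniProtKB\tX00004\tGENE_D\tinvolved_in\tGO:0006281\tPMID:0\tIDA\t\tP",
]

# GO:0051726 is 'regulation of cell cycle'.
supported = genes_with_experimental_support(toy_gaf, {"GO:0051726"})
print(sorted(supported))
```

Note that this only tells you a gene has experimental support for a role in cell cycle regulation. It does not by itself demonstrate tumor suppressor activity, so the result is best treated as a candidate list for further curation.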
Useful suggestions gathered from below (refer there for more details):
- Network of Cancer Genes
- NCI Cancer Gene Index
- Table of oncogenes and tumor suppressor genes from Vogelstein et al. 2013
- MSKCC Cancer Genes Via Gene Ranker
- TSGene from Vanderbilt
Sources you could mine to develop your own lists:
- ICGC data portal
- TCGA data portal
- MSKCC cBioPortal
- IntOGen, Integrative Oncogenomics
- Broad, Tumor Portal
- Genomic Data Portal
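
If you do mine one of these sources, the empirical approach described earlier (loss-of-function mutations suggesting a tumor suppressor, recurrent hotspots suggesting an oncogene) could be roughly sketched as below. This is a toy heuristic, not a published method: the consequence labels, the `classify_gene` function, the 20% threshold, the minimum mutation count, and the order in which the two checks are applied are all assumptions chosen for illustration. It also ignores copy number changes and gene fusions entirely.

```python
from collections import Counter

# Consequence labels treated as truncating (assumed vocabulary; adapt to your data).
TRUNCATING = {"nonsense", "frameshift", "splice_site"}


def classify_gene(mutations, min_mutations=10, threshold=0.2):
    """Classify one gene from its somatic mutations.

    mutations: list of (consequence, protein_position) tuples, one per observed mutation.
    Returns 'tumor suppressor-like', 'oncogene-like', 'unclassified',
    or 'insufficient data'.
    """
    n = len(mutations)
    if n < min_mutations:
        return "insufficient data"

    # Fraction of all mutations that are predicted to truncate the protein.
    truncating_fraction = sum(1 for cons, _ in mutations if cons in TRUNCATING) / n

    # Fraction of all mutations that are missense at a position hit at least twice.
    missense_positions = Counter(pos for cons, pos in mutations if cons == "missense")
    recurrent = sum(count for count in missense_positions.values() if count >= 2)
    recurrent_fraction = recurrent / n

    if truncating_fraction > threshold:
        return "tumor suppressor-like"
    if recurrent_fraction > threshold:
        return "oncogene-like"
    return "unclassified"


# Toy data: one gene dominated by a single hotspot, one dominated by truncations.
hotspot_gene = [("missense", 600)] * 12 + [("missense", 10), ("missense", 250), ("missense", 470)]
truncated_gene = (
    [("nonsense", p) for p in (15, 88, 140, 210, 305, 412)]
    + [("frameshift", p) for p in (33, 97, 260, 390)]
    + [("missense", p) for p in (5, 120, 230, 340, 450)]
)

print(classify_gene(hotspot_gene))
print(classify_gene(truncated_gene))
```

In practice you would also want to account for gene length, background mutation rate, and sample size, which is what the dedicated resources listed above try to do more rigorously.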
Organizations that are annotating druggable/actionable genes:
- Washington University Genome Institute: DGIdb, DOCM, CIViC
- MD Anderson: PCT
- Vanderbilt: MyCancerGenome
- Dienstmann lab: GDKD
Some relevant posts:
- Web resources to find cancer indication where a given gene is amplified
- Where can I find mutation databases specialized in cancer?
- Where can I find Data sets of cancer publicly available?
- Mutated gene sequence database for breast cancer
- help in obtaining a cancer-related database
- List of known cancer resources
- Learning about cancer from the ground up...
- Sources of information about amplified genes (and overexpressed) in cancer
- Exploring cancer mutation data portals